Peer Firm Identification Using Word Embeddings

Taeyoung Kee*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


In peer firm identification, researchers have relied on existing industry classification systems despite their critical limitations. Under these systems, a company is assigned to a single group regardless of the number of products and services it offers. Furthermore, the similarity between companies within the same group cannot be measured. The systems are also revised manually, which makes it difficult for them to keep up with the fast-changing industry landscape. In this paper, we propose a novel peer firm identification method based on Word2Vec. By computing the cosine similarity of word embedding vectors trained on a 10-year corpus of financial news articles, we developed a method that produces peer firms together with numeric similarity scores. Our approach allows us to observe chronological changes in peer firms by placing firm words that appear in news articles from different periods in the same vector space. Finally, our Word2Vec-based method produced more economically homogeneous groups of peer firms than the existing classification systems.
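
The similarity-scoring step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the firm names, the three-dimensional vectors, and the helper names `cosine_similarity` and `rank_peers` are all hypothetical, and real Word2Vec embeddings would be trained on the news corpus and have far more dimensions. The sketch only shows how cosine similarity turns a set of firm vectors into a ranked list of peers with numeric scores.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_peers(target, embeddings, top_k=3):
    """Rank all other firms by cosine similarity to the target firm's vector."""
    target_vec = embeddings[target]
    scores = [
        (firm, cosine_similarity(target_vec, vec))
        for firm, vec in embeddings.items()
        if firm != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy embeddings standing in for trained firm-word vectors.
toy_embeddings = {
    "firm_a": [0.9, 0.1, 0.0],
    "firm_b": [0.8, 0.2, 0.1],
    "firm_c": [0.0, 0.1, 0.9],
    "firm_d": [0.1, 0.9, 0.2],
}

peers = rank_peers("firm_a", toy_embeddings, top_k=2)
for firm, score in peers:
    print(f"{firm}: {score:.3f}")
```

Unlike a fixed industry code, the ranked list carries a continuous score for every pair of firms, so the number of peers can be chosen by a cutoff or a top-k rule.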

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Number of pages: 8
ISBN (Electronic): 9781728108582
Publication status: Published - 1 Dec 2019
MoE publication type: A4 Article in a conference publication
Event: IEEE International Conference on Big Data - Los Angeles, United States
Duration: 9 Dec 2019 to 12 Dec 2019


Conference: IEEE International Conference on Big Data
Abbreviated title: Big Data
Country: United States
City: Los Angeles


  • financial news
  • industry classification
  • Peer firms
  • word embedding
