Semantic Scholar is a free search platform launched by the Allen Institute for AI in 2015. The “highlight” of the service is its deep semantic data mining technology, which extends the functions of an ordinary database. The search algorithm not only returns a list of results but also summarizes each text in a couple of sentences, which makes it easier to pick out the relevant information.
How does it work and how can you access it?
The database indexes the metadata of ~200 million publications, which is considerably more than Web of Science (WoS) or Scopus. The AI then “enriches” the data: it extracts everything essential from each document in order to present it to the user in the most comprehensive form, namely a PDF file, a bibliographic description in several citation standards, information about the authors and sources, and a summary. The output is a knowledge graph, that is, a semantic network that stores information about different entities and the connections between them.
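To make the idea of a knowledge graph more concrete, here is a minimal sketch in Python. It is not how Semantic Scholar stores its data: the class, the relation names (`written_by`, `published_in`, `cites`), and the sample entities are all hypothetical and serve only to illustrate "entities plus connections between them".

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy knowledge graph: typed entities plus labeled, directed edges."""

    def __init__(self):
        self.entities = {}              # entity id -> {"type": ..., "name": ...}
        self.edges = defaultdict(list)  # source id -> [(relation, target id), ...]

    def add_entity(self, entity_id, entity_type, name):
        self.entities[entity_id] = {"type": entity_type, "name": name}

    def add_edge(self, source, relation, target):
        self.edges[source].append((relation, target))

    def neighbors(self, source, relation):
        """Return the ids of entities linked from `source` by `relation`."""
        return [target for rel, target in self.edges[source] if rel == relation]


# Hypothetical sample data, not taken from Semantic Scholar.
graph = KnowledgeGraph()
graph.add_entity("p1", "paper", "Paper A")
graph.add_entity("p2", "paper", "Paper B")
graph.add_entity("a1", "author", "Author X")
graph.add_entity("v1", "venue", "Journal Y")
graph.add_edge("p1", "written_by", "a1")
graph.add_edge("p1", "published_in", "v1")
graph.add_edge("p2", "cites", "p1")

print(graph.neighbors("p1", "written_by"))  # ['a1']
print(graph.neighbors("p2", "cites"))       # ['p1']
```

Because papers, authors, and venues are all nodes in one structure, a question such as "who wrote the papers that this paper cites?" becomes a matter of following two edges in a row.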
Illustration of the Semantic Scholar Platform (Semantic Scholar Open Data Platform, 2015)
Unlike The Lens and Dimensions, which operate on a freemium model, Semantic Scholar is not a commercial project, so all of its features are available free of charge.
Search in Semantic Scholar
To start your search, enter a keyword or a phrase in the search field. The system automatically suggests articles that may be of interest: a semantic analysis algorithm generates the suggestions based on how frequently the phrase occurs in the text. If you want to see the complete list of relevant sources, click “search”.
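A keyword search of this kind can also be scripted. The sketch below only builds a request URL and does not send it. It assumes the paper search endpoint of the public Semantic Scholar Academic Graph API and its `query`, `fields`, and `limit` parameters; none of this is described in this article, so check the current API documentation before relying on it. The helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: paper search endpoint of the public Academic Graph API.
# Verify against the official API documentation before use.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(query, fields=("title", "year", "citationCount"), limit=10):
    """Build (but do not send) a search request URL for a keyword or phrase."""
    params = {
        "query": query,              # the keyword or phrase to search for
        "fields": ",".join(fields),  # which metadata fields to return
        "limit": limit,              # maximum number of results
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


url = build_search_url("knowledge graph")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client. Rate limits and API keys are outside the scope of this sketch.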
The search results can be narrowed by field of study, publication period, author, availability of the full text, and the journal in which the text was published. You can also sort the documents by relevance, by citation count and influence, or by recency.
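The filtering and sorting options described above can be pictured as simple operations on a list of records. The following is a minimal sketch on hypothetical data: the `Paper` fields, the function names, and the sample values are assumptions for illustration, not the platform's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    year: int
    field: str
    citations: int
    influential_citations: int
    open_access: bool


def filter_papers(papers, field=None, year_from=None, year_to=None, open_access_only=False):
    """Keep only the papers that satisfy every filter that was set."""
    result = []
    for p in papers:
        if field is not None and p.field != field:
            continue
        if year_from is not None and p.year < year_from:
            continue
        if year_to is not None and p.year > year_to:
            continue
        if open_access_only and not p.open_access:
            continue
        result.append(p)
    return result


def sort_papers(papers, by="citations"):
    """Sort in descending order by citation count, influence, or recency."""
    keys = {
        "citations": lambda p: p.citations,
        "influence": lambda p: p.influential_citations,
        "recency": lambda p: p.year,
    }
    return sorted(papers, key=keys[by], reverse=True)


# Hypothetical sample records.
papers = [
    Paper("Paper A", 2016, "Computer Science", 120, 10, True),
    Paper("Paper B", 2021, "Computer Science", 45, 12, False),
    Paper("Paper C", 2019, "Medicine", 300, 5, True),
    Paper("Paper D", 2023, "Computer Science", 8, 1, True),
]

cs_open = filter_papers(papers, field="Computer Science", year_from=2015, open_access_only=True)
print([p.title for p in sort_papers(cs_open, by="recency")])    # ['Paper D', 'Paper A']
print([p.title for p in sort_papers(papers, by="influence")])   # ['Paper B', 'Paper A', 'Paper C', 'Paper D']
```

Note that the two sort orders can disagree: in this sample, Paper C has the most citations overall but ranks third by influence.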
Each article is summarized in its abstract. If the abstract seems too long, however, you can rely on the AI-generated summary, which covers the main idea in a couple of short sentences.
Note: TLDR (“too long; didn't read”) is an abbreviation indicating that a text was skipped because of its length. Here it is the label given to the short AI-generated summary.
Semantic Scholar also displays the paper's citations broken down by year.
The bibliographic description of a document can be copied or exported to a reference manager via the “cite” button.
Extended document data
You can open a document's page to see more detailed information about it.
The most important quantitative indicators for assessing the “weight” of the paper are displayed on the right. In addition to the total number of times the paper has been cited in other works, the system reports how many of those citations are highly influential.
Note: Highly influential citations are identified by an internal machine learning model. It analyzes a range of factors, including the number of citations the paper received and the context surrounding each reference.
A significant advantage of Semantic Scholar is its cross-search capability: for each paper, you can see both the works it references and the works that cite it. Thus, by following the chain of citations, the user can trace the continuity of ideas in an academic field.
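Following a chain of citations amounts to walking a directed graph, either backward through references or forward through citing works. Here is a minimal sketch using a breadth-first walk. The paper ids and the `references` mapping are hypothetical sample data, and the function names are illustrative only.

```python
from collections import deque

# Hypothetical data: paper id -> ids of the papers it references.
references = {
    "D": ["B", "C"],
    "C": ["A"],
    "B": ["A"],
    "A": [],
}


def invert(references):
    """Turn 'paper -> papers it references' into 'paper -> papers citing it'."""
    cited_by = {paper_id: [] for paper_id in references}
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by.setdefault(cited, []).append(citing)
    return cited_by


def trace(start, links, max_depth=None):
    """Breadth-first walk along `links`; returns {paper id: distance from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if max_depth is not None and depth[current] >= max_depth:
            continue  # do not expand beyond the requested depth
        for nxt in links.get(current, []):
            if nxt not in depth:  # skip papers that were already reached
                depth[nxt] = depth[current] + 1
                queue.append(nxt)
    del depth[start]  # report only the papers reached, not the start itself
    return depth


print(trace("D", references))          # works D builds on, directly or indirectly
print(trace("A", invert(references)))  # works that build on A, directly or indirectly
```

Walking `references` from a recent paper leads back to the older works an idea rests on, while walking the inverted mapping from an older paper shows where the idea went next.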
A separate tab shows related works selected by the AI, each with a short description.
Search by author
Enter the author’s name in the search field and select the person you need. You can use filters to refine the results.
The data from the author’s profile and their organization provide a fairly complete picture of the researcher’s collaborations with other members of the academic community: separate tabs list the authors whom the researcher has cited, the authors who have cited the researcher, and the researcher's co-authors. Papers can be searched within the profile just as if it were a separate database. You can also filter for the researcher's most influential works and thus quickly gauge their contribution to science.
A glance at a publication is usually enough to assess its “weight”: the total number of citations and the number of highly influential ones are displayed under the author’s name.
The researcher's ORCID and the most significant quantitative indicators of their work according to Semantic Scholar are shown on the left, under the researcher's name: the number of papers, the h-index, the total number of citations, and the number of highly influential citations. You can follow leading experts by setting up an alert. Notifications about new papers will then be sent to the email address you provide.
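For reference, the h-index is the largest number h such that the researcher has h papers with at least h citations each. The sketch below computes it from a list of per-paper citation counts, using the standard definition. The sample counts are hypothetical, and how Semantic Scholar computes the value internally is not described here.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            break  # counts only get smaller from here on
    return h


# Hypothetical citation counts for one researcher's six papers.
counts = [25, 8, 5, 3, 3, 0]

print(len(counts))      # number of papers: 6
print(h_index(counts))  # h-index: 3
print(sum(counts))      # total citations: 44
```

In this sample, one heavily cited paper accounts for more than half of the 44 citations, yet the h-index is only 3, which is why the profile shows several indicators side by side.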
Advantages and disadvantages of Semantic Scholar
The main advantage of the platform is its huge coverage of papers and free access to all functions, which are much more diverse than in any other open database. However, all open databases share the same disadvantage: they prioritize quantity over selectivity. That is, the searchers have to assess the quality of the works found themselves. In this sense, even the fact that an article has been cited is not in itself an indicator of authority.
Tracking cross-citations and collaborations between scientists lets you explore the contributions of the most influential scientists in a field and compile a list of current topics for potential research. However, please note that the data found is based on metadata from open sources. Authors can also add papers to their own profiles, and no one checks the quality of these additions. Therefore, an author’s profile cannot give an objective picture of the scientist’s contribution.
Semantic Scholar is best used as a supporting tool: a scientist can take advantage of the platform's features while cross-checking the data against authoritative databases. For example, they can use the free features of Scopus to check whether the journal in which an article was published is indexed, or compare a scientist's indicators with their profile in Web of Science.