Verifying a person's PhD
This document describes how Fixpoint's PhD verification works. It covers the criteria Fixpoint uses to determine:
- whether a research publication was written by a given author
- whether a person attended a school for a particular degree
- whether a person graduated from a school with a particular degree
General resolution categories
Whenever we decide whether a paper, education record, or similar item is
relevant to a user, we set a
ResolutionConfidence.
When we attach a ResolutionConfidence to something like a research
publication, we call the publication the “entity” and the person containing the
publication the “parent”.
A ResolutionConfidence includes these fields:
- confidence - how confidently we resolved a paper or degree to the person
- reason - a human-readable reason for the resolution
- reason_code (optional) - a machine-parseable reason code
- details (optional) - more complex machine-parseable JSON for the resolution, such as the sources used to verify an education
Confidence can be one of these levels:
- high: system is extremely confident that the entity can be resolved to its parent
- medium: we are mostly confident that the entity can be resolved to its parent, but we have reasons preventing full confidence
- low: the entity might resolve to its parent, but either we have not run enough checks to establish confidence or the checks we ran gave only minimal confidence. You should NOT rely on this linkage without running more checks yourself.
- unverified: entity link was claimed, but we could not verify it
- absent: the entity is either empty or does not have enough information to even attempt resolution to its parent
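The fields and levels above can be sketched as a small data model. This is a minimal illustration, not Fixpoint's actual schema: the class layout, the string values of the enum, and the `needs_more_checks` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class Confidence(str, Enum):
    """Confidence levels described above (string values are assumed)."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    UNVERIFIED = "unverified"
    ABSENT = "absent"


@dataclass
class ResolutionConfidence:
    """Hypothetical shape of a resolution result for one entity."""
    confidence: Confidence
    reason: str                               # human-readable explanation
    reason_code: Optional[str] = None         # machine-parseable code
    details: Optional[Dict[str, Any]] = None  # e.g. sources used


def needs_more_checks(rc: ResolutionConfidence) -> bool:
    """Hypothetical consumer-side policy: anything below medium should not
    be relied on without further checks."""
    return rc.confidence in (
        Confidence.LOW,
        Confidence.UNVERIFIED,
        Confidence.ABSENT,
    )
```

A caller might use `needs_more_checks` to route low-confidence links to manual review; where exactly to draw that line is a choice for the consumer, not something this document specifies.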
Papers and publication resolution
We find a person’s papers in two main ways:
- We find the person’s academic research author profile and then crawl the papers listed there
- We search for individual papers directly
Author profiles
We use web search and research to find the person’s author profile, such as a Google Scholar or ORCID profile. If the profile site has its own method of verifying that listed papers belong to the author, we mark all found papers as “high” confidence. Otherwise, we save the paper titles and URLs and process them via our “individual papers” workflow (see below).
To ensure we found the correct author profile, we verify that the organizations or field of study on the author profile match the information on the resume.
Individual papers
We run multiple searches to find papers that might be attributable to an author. We run every paper we find through our entity resolution confidence process. Here is how we classify confidence:
- High:
  - Found a paper listed on their resume and verified an author name match from the paper
  - Found the paper independently and verified both an author name match and an organization match
- Medium:
  - Found the paper independently and verified an author name match only
- Low:
  - Found the paper via the web research process, but could only verify a partial name match (i.e. just the last name)
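The classification rules above can be read as a small decision function. The sketch below is illustrative only: the function name and its boolean inputs are hypothetical, and the `None` return for cases the rules do not cover is an assumption.

```python
from typing import Optional


def classify_paper(
    listed_on_resume: bool,
    full_name_match: bool,
    organization_match: bool,
    partial_name_match: bool = False,
) -> Optional[str]:
    """Map the checks described above to a confidence level.

    Returns None when no rule applies; the document does not say
    what happens in that case, so this is an assumption.
    """
    # High: name verified, plus either the resume or the organization agrees.
    if full_name_match and (listed_on_resume or organization_match):
        return "high"
    # Medium: name verified, nothing else to corroborate it.
    if full_name_match:
        return "medium"
    # Low: only part of the name (e.g. last name) could be verified.
    if partial_name_match:
        return "low"
    return None
```

For example, a paper found independently whose author name and organization both match would come back as `"high"`, while the same paper without the organization match would be `"medium"`.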
Education resolution
We verify two things, separately:
- Did the person begin a degree at a given school? (enrollment)
- Did the person complete their degree? (graduation)
Fixpoint starts education verification by first running multiple web research agents, which search and crawl the web to find information about a person’s education. For example, you might be doing an education check on Dylan Mikus. His resume says:
- Carnegie Mellon University
- Graduated with a Bachelors in computer science in 2014
Fixpoint runs multiple web searches, collects every website it discovers, and checks each one:
- Is it about Dylan Mikus?
- Is it about his bachelor's degree in CS?
- Does it claim he enrolled in Carnegie Mellon University?
- Does it claim he graduated from Carnegie Mellon University?
For each website, we verify if the website is credible. For example, a university news article, a cited research paper, or conference proceedings are credible. LinkedIn posts are not.
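One way to picture the per-website checks is as a record per page plus a simple aggregation step. This is a rough sketch under stated assumptions: `PageEvidence`, `summarize`, and the rule that a single credible, on-topic page is enough to support a claim are all hypothetical and not taken from Fixpoint's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class PageEvidence:
    """Hypothetical result of checking one discovered website."""
    url: str
    about_person: bool       # is it about the person being checked?
    about_degree: bool       # is it about the degree in question?
    claims_enrollment: bool  # does it claim the person enrolled?
    claims_graduation: bool  # does it claim the person graduated?
    credible: bool           # e.g. university news, not a LinkedIn post


def summarize(pages: Iterable[PageEvidence]) -> Tuple[bool, bool]:
    """Return (enrollment_supported, graduation_supported).

    Assumed rule: only credible pages about the right person and
    degree count, and one such page is enough to support a claim.
    """
    usable = [p for p in pages if p.credible and p.about_person and p.about_degree]
    enrolled = any(p.claims_enrollment for p in usable)
    graduated = any(p.claims_graduation for p in usable)
    return enrolled, graduated
```

Enrollment and graduation are tracked separately here because the document verifies them separately.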
Here are the different ways we verify education, along with their reason codes:
- VERIFIED_BY_EDU_SITE: found evidence from an education institution's website
- VERIFIED_BY_INDUSTRY_SITE: found evidence from an industry (i.e. company) website
- VERIFIED_BY_LAB_SITE: found evidence from a lab website
- VERIFIED_BY_PROOF_OF_POSTDOC: found evidence from a proof of postdoc
- VERIFIED_BY_GOVERNMENT_SITE: found evidence from a government website
- VERIFIED_BY_PUBLICATION: found evidence from an academic publication. Does not imply graduation, only enrollment.
- VERIFIED_BY_THESIS_FOUND: found evidence from an academic thesis. Implies graduation.
- VERIFIED_BY_NEWS_SITE: found evidence from a news website
- VERIFIED_BY_GOOGLE_SCHOLAR: found evidence from a Google Scholar profile. Only verifies enrollment, not graduation.
Research papers:
- If we find high-confidence research papers attributed to an educational institution:
  - papers from the institution → enrolled
  - thesis from the institution → graduated
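For consumers of these reason codes, the enrollment-versus-graduation distinction can be captured in a lookup. The sketch below is an assumption-laden illustration: the enum's string values are assumed to equal the code names, and `implies_graduation` is a hypothetical helper. It records an answer only for the three codes where this document states one and returns `None` for the rest.

```python
from enum import Enum
from typing import Optional


class EducationReasonCode(str, Enum):
    """Reason codes listed above (string values are assumed)."""
    VERIFIED_BY_EDU_SITE = "VERIFIED_BY_EDU_SITE"
    VERIFIED_BY_INDUSTRY_SITE = "VERIFIED_BY_INDUSTRY_SITE"
    VERIFIED_BY_LAB_SITE = "VERIFIED_BY_LAB_SITE"
    VERIFIED_BY_PROOF_OF_POSTDOC = "VERIFIED_BY_PROOF_OF_POSTDOC"
    VERIFIED_BY_GOVERNMENT_SITE = "VERIFIED_BY_GOVERNMENT_SITE"
    VERIFIED_BY_PUBLICATION = "VERIFIED_BY_PUBLICATION"
    VERIFIED_BY_THESIS_FOUND = "VERIFIED_BY_THESIS_FOUND"
    VERIFIED_BY_NEWS_SITE = "VERIFIED_BY_NEWS_SITE"
    VERIFIED_BY_GOOGLE_SCHOLAR = "VERIFIED_BY_GOOGLE_SCHOLAR"


# Only codes whose implication is stated in this document are listed.
_GRADUATION_IMPLIED = {
    EducationReasonCode.VERIFIED_BY_PUBLICATION: False,     # enrollment only
    EducationReasonCode.VERIFIED_BY_THESIS_FOUND: True,     # implies graduation
    EducationReasonCode.VERIFIED_BY_GOOGLE_SCHOLAR: False,  # enrollment only
}


def implies_graduation(code: EducationReasonCode) -> Optional[bool]:
    """True/False where the document states it, None where it does not."""
    return _GRADUATION_IMPLIED.get(code)
```

Returning `None` rather than guessing keeps the helper honest about codes such as VERIFIED_BY_EDU_SITE, where the evidence could support enrollment, graduation, or both depending on the page.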
Sites we special-case to include or exclude
Exclusions
We purposefully do not allow claims from these sites:
- Social media sites - Quora, LinkedIn, etc.
However, these sites can link to other sites that then corroborate the person’s education.
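An exclusion like this is commonly implemented as a hostname check. The following is a minimal sketch, assuming a hypothetical `EXCLUDED_DOMAINS` set; the two domains shown stand in for the "Quora, LinkedIn, etc." list above and are not Fixpoint's actual configuration.

```python
from urllib.parse import urlparse

# Illustrative only: the real exclusion list is not given in this document.
EXCLUDED_DOMAINS = {"quora.com", "linkedin.com"}


def is_excluded_source(url: str) -> bool:
    """Return True if the URL's host is an excluded domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in EXCLUDED_DOMAINS)
```

Under this sketch, a claim found on an excluded page would be dropped, while links discovered on that page could still be crawled and judged on their own merits, in line with the note above.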