Facebook and LinkedIn have two of the largest and richest APIs for information about people. Their data is submitted by users, updated regularly, and includes personal details and relationships. Unfortunately, you can only access those parts of the social graph that your user has access to.
At Seravia, we have taken a different approach to finding information about people and their relationships. We have collected over 200 million government documents and mined these to identify millions of people, their roles, their skills, their various names and their relationships to both people and institutions. These details are often the sort that are not openly published on social sites and that lie outside your user's immediate social graph.
We are now making these details available for use in other applications or data research projects through an easy-to-use Person API.
Overview
Let’s say you want to find out more about Jeff Dean, one of the most famous engineers at Google. Specifically, you want to find variations on his name (e.g. “Jeffrey Dean”), areas of expertise (e.g. “Machine Learning”), related institutions (e.g. “Stanford University”) and publications (e.g. “System and Methods for Automatically Creating Lists”).
Using the Person API you can submit a name and, optionally, a list of affiliated companies and universities and get in return a list of skills and publications associated with the person.
Request
All of the Data APIs are RESTful and return JSON.
Domain
Each Data API uses our company domain and a dedicated “api” subdomain:
http://api.seravia.com
Version
Our Data APIs will roll out gradually and the semantics are likely to change as we add features and increase the datatypes we support. For this reason all current requests should use the “v1” prefix:
http://api.seravia.com/v1
Action
The Person API uses a single action called “people”. This will return aliases, related companies, skills and publications for a single person.
http://api.seravia.com/v1/people
Parameters
name | Required | This is the full name of the individual you are looking for. Try to identify the exact legal name that is likely to appear on official documents, including full given name and, in some cases, middle initial. The Person API will return related names to help you find the exact one (or more) you are looking for. |
companies | Optional | This is a list of any institutions (companies or universities) with which the person has been affiliated. |
key | Required | Each application requires a key. Check here for details on getting a key. |
Note: parameters must be URL encoded, including replacing blank spaces with %20.
http://api.seravia.com/v1/people?name=Jeffrey%20Dean&api_key=XYZ
See the Examples section for variations.
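As a minimal sketch of the encoding rule above, the request URL can be assembled with Python's standard library. The helper name build_people_url is hypothetical and not part of the API. Note that the parameter table calls the key parameter “key” while the sample URL uses “api_key”; this sketch follows the sample URL and leaves the parameter name configurable, which is an assumption.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the Action section above.
BASE_URL = "http://api.seravia.com/v1/people"


def build_people_url(name, api_key=None, key_param="api_key"):
    """Build a Person API request URL (hypothetical helper).

    key_param defaults to "api_key" as in the sample URL; the parameter
    table names it "key", so adjust if the service expects that instead.
    """
    params = {"name": name}
    if api_key is not None:
        params[key_param] = api_key
    # quote_via=quote encodes blank spaces as %20 rather than "+".
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


print(build_people_url("Jeffrey Dean", api_key="XYZ"))
# prints http://api.seravia.com/v1/people?name=Jeffrey%20Dean&api_key=XYZ
```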
Response
The Person API returns up to six types of data related to the person name submitted. A typical response looks like:
{
  "aliases": [
    "Jeff Dean",
    "Dean Jeffrey"
  ],
  "companies": [
    "Google Inc",
    "Stanford University"
  ],
  "people": [
    "Matt Cutts",
    "Karl Pfleger"
  ],
  "specialties": [
    "electric digital data processing",
    "information retrieval",
    "user interface"
  ],
  "locations": [
    "1600 Amphitheatre Pky, Mountain View, CA",
    "Palo Alto, CA"
  ],
  "publications": [
    "Retaining wall masonry block",
    "System and method for reorganizing data storage in accordance with usage frequency",
    "Methods and apparatus for serving relevant advertisements"
  ]
}
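Because a response carries up to six lists, a client may want to tolerate missing keys. The following is a sketch only: the PersonResult class and parse_person function are hypothetical names, and treating an absent key as an empty list is an assumption about how the service behaves.

```python
import json
from dataclasses import dataclass, field
from typing import List

# The six keys shown in the sample response above.
FIELDS = ("aliases", "companies", "people",
          "specialties", "locations", "publications")


@dataclass
class PersonResult:
    """Container for one Person API response (hypothetical type)."""
    aliases: List[str] = field(default_factory=list)
    companies: List[str] = field(default_factory=list)
    people: List[str] = field(default_factory=list)
    specialties: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    publications: List[str] = field(default_factory=list)


def parse_person(body: str) -> PersonResult:
    """Parse a JSON response body; keys that are absent become empty lists."""
    data = json.loads(body)
    return PersonResult(**{key: list(data.get(key, [])) for key in FIELDS})


sample = '{"aliases": ["Jeff Dean", "Dean Jeffrey"], "companies": ["Google Inc"]}'
result = parse_person(sample)
print(result.aliases)       # the two aliases from the sample
print(result.publications)  # empty list, since the key was absent
```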
Aliases
Aliases include other names that may be alternatives to the name given. For instance, “Jeffrey Dean” returns “Jeffrey Dean”, “Jefrey Dean”, “Dean Jeff” and “Jeff R Dean”. These variations all appear in public documents and may include misspellings, longer or shorter given names, inclusion or exclusion of a middle name or initial, and different word orders.
Each alias may be called separately to return related entities and publications associated with that name. In subsequent versions we will add the ability to include/exclude any of these aliases.
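One way to use this is to query the original name, then query each returned alias and merge the results. The sketch below assumes a caller-supplied fetch function (hypothetical) that takes a name and returns the decoded response as a dictionary. Keep in mind that an alias may belong to a different person with a similar name, so merged results should be reviewed.

```python
def collect_publications(name, fetch):
    """Gather publications for a name and each alias returned for it.

    fetch is a hypothetical callable: fetch(name) -> dict shaped like the
    sample response. Order is preserved and duplicates are dropped.
    """
    first = fetch(name)

    # Start with the submitted name, then add each alias not yet seen.
    names = [name]
    for alias in first.get("aliases", []):
        if alias not in names:
            names.append(alias)

    publications = []
    for current in names:
        # Reuse the first response instead of fetching the same name twice.
        result = first if current == name else fetch(current)
        for title in result.get("publications", []):
            if title not in publications:
                publications.append(title)
    return publications


# Usage with canned data standing in for real API calls.
canned = {
    "Jeffrey Dean": {"aliases": ["Jeff Dean"], "publications": ["A", "B"]},
    "Jeff Dean": {"aliases": [], "publications": ["B", "C"]},
}
print(collect_publications("Jeffrey Dean", lambda n: canned[n]))
# prints ['A', 'B', 'C']
```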
We also try to identify and score unique persons with identical names (the “John Smith” problem) using shared employers, shared agents, shared coworkers, skillsets, geolocation, time and other signals. In subsequent versions we will release
Companies
Companies includes companies or institutions such as universities with which this person has been affiliated. For instance, “Jeff Dean” returns “Google Inc” and “Stanford University”. These companies have appeared on public documents alongside the person. Company affiliations include but are not limited to:
- assignee of a patent
- owner of a trademark
- company being registered
- company under contract
- disclosed employer on campaign contributions
- disclosed relationship as manager, director or shareholder
- law firm advising on a filing
Companies are not always explicitly identified as such. We look at entity type, role on the filing and inclusion of certain keywords in the name to determine whether an entity is a company. These determinations are not always correct.
Each company may then be looked up using the Company API.
People
People includes people with whom this person has been affiliated. For instance, “Jeffrey Dean” returns “Matt Cutts” and “Karl Pfleger”.
These people have appeared on public documents alongside the person. This means that it only includes people that are one degree separated from the person. It does not include second-degree relationships such as people affiliated with one or more of the affiliated companies.
Person affiliations include but are not limited to:
- co-inventors on a patent
- co-owners of a trademark
- fellow manager, director or shareholder
- lawyer advising on a filing
Specialties
Specialties include the most common technologies with which this person is affiliated. For instance, “Jeffrey Dean” returns “search”, “machine learning”, etc.
We identify and score unique skills by mining millions of publications and factoring in word frequency, patent classifications, inventors and other signals. This is a non-trivial problem and requires filtering out a large number of frequent but uninformative terms.
Locations
Locations includes addresses with which this person has been affiliated. For instance, “Jeffrey Dean” returns “1600 Amphitheatre Pky, Mountain View, CA” and “Palo Alto, CA”.
These addresses have appeared on public documents as either the person’s explicit address or the primary address on the filing. This means that it only includes addresses that are one degree separated from the person. It does not include second-degree relationships such as addresses affiliated with one or more of the affiliated companies.
Most addresses have been normalized and deduplicated using longitude and latitude. In many instances the address will not include details down to the street level. In these cases it will only include city and country.
Publications
Publications are all those documents on which the person appears. The full list of document types appears here.
Examples
- http://api.seravia.com/v1/people?name=Jeffrey%20Dean&companies=(Google%20Inc,Stanford%20University)
- http://api.seravia.com/v1/people?name=Dean%20Kamen
- http://api.seravia.com/v1/people?name=Nathan%20Myhrvold
- http://api.seravia.com/v1/people?name=Jonathan%20Ive
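The first example passes the companies list in parentheses, separated by commas. A sketch of producing that shape follows; the helper name is hypothetical, and the assumption is that the parentheses and separating commas are sent literally, as they appear in the example, while each company name is encoded on its own.

```python
from urllib.parse import quote

BASE_URL = "http://api.seravia.com/v1/people"


def build_people_url_with_companies(name, companies=()):
    """Build a request URL with an optional companies list (hypothetical helper)."""
    query = "name=" + quote(name, safe="")
    if companies:
        # Encode each company separately so a comma inside a company name
        # becomes %2C and cannot be mistaken for a list separator.
        joined = ",".join(quote(company, safe="") for company in companies)
        query += "&companies=(" + joined + ")"
    return BASE_URL + "?" + query


print(build_people_url_with_companies(
    "Jeffrey Dean", ["Google Inc", "Stanford University"]))
# prints http://api.seravia.com/v1/people?name=Jeffrey%20Dean&companies=(Google%20Inc,Stanford%20University)
```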
Datasources
In the first version this information comes largely from ~55 million worldwide patents. Scholarly journals will be added in subsequent versions.
Obtaining a Key
If you are interested in advance access, please contact us at [email protected].