Person API

Facebook and LinkedIn have two of the largest and richest APIs for information about people. Their data is submitted by users, updated regularly and includes personal details and relationships. Unfortunately you can only access those parts of the social graph that your user has access to.

At Seravia, we have taken a different approach to finding information about people and their relationships. We have collected over 200 million government documents and mined them to identify millions of people, their roles, their skills, their various names and their relationships to both people and institutions. These details are often of a sort that is not openly published on social sites and that lies outside your user's immediate social graph.

We are now making these details available for use in other applications or data research projects through an easy-to-use Person API.

  1. Overview
  2. Request
  3. Response
  4. Examples
  5. Datasources
  6. Obtaining a Key

Overview

Let’s say you wanted to find out more about Jeff Dean, one of the most famous engineers at Google. Specifically you want to find variations on his name (e.g. “Jeffrey Dean”), areas of expertise (e.g., “Machine Learning”), related institutions (e.g. “Stanford University”) and publications (e.g. “System and Methods for Automatically Creating Lists”).

Using the Person API you can submit a name and, optionally, a list of affiliated companies and universities and get in return a list of skills and publications associated with the person.

Request

All of the Data APIs are RESTful and return JSON.

Domain

Each Data API uses our company domain and a dedicated “api” subdomain:

http://api.seravia.com

Version

Our Data APIs will roll out gradually and the semantics are likely to change as we add features and increase the datatypes we support. For this reason all current requests should use the “v1” prefix:

http://api.seravia.com/v1

Action

The Person API uses a single action called “people”. This will return aliases, related companies, skills and publications for a single person.

http://api.seravia.com/v1/people

Parameters

name (Required): The full name of the individual you are looking for. Try to identify the exact legal name that is likely to appear on official documents, including full given name and, in some cases, middle initial. The Person API will return related names to help you find the exact one (or more) you are looking for.
companies (Optional): A list of any institutions (companies or universities) with which the person has been affiliated.
key (Required): Each application requires a key. See Obtaining a Key for details on getting one.

Note: parameters must be URL encoded, including replacing blank spaces with %20.

http://api.seravia.com/v1/people?name=Jeffrey%20Dean&api_key=XYZ

See the Examples section for variations.
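As a minimal sketch of the encoding rule above, the following Python builds a request URL with the standard library. The helper name `build_people_url` is hypothetical. The parenthesised, comma-separated `companies` format is copied from the first URL in the Examples section, and the key parameter defaults to `api_key` because the sample URL uses that name even though the parameter list calls it `key`; treat both as assumptions rather than confirmed behaviour.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://api.seravia.com/v1/people"


def build_people_url(name, companies=None, api_key=None, key_param="api_key"):
    """Return a Person API request URL with URL-encoded parameters."""
    params = {"name": name}
    if companies:
        # Assumed list format, taken from the Examples section: (A,B)
        params["companies"] = "(" + ",".join(companies) + ")"
    if api_key:
        # Assumption: the parameter list says "key", the sample URL "api_key".
        params[key_param] = api_key
    # quote_via=quote encodes spaces as %20 rather than "+";
    # safe="()," leaves the list punctuation unescaped, as in the examples.
    query = urlencode(params, safe="(),", quote_via=quote)
    return f"{BASE_URL}?{query}"


print(build_people_url("Jeffrey Dean", api_key="XYZ"))
```

Pass `key_param="key"` if the service turns out to expect the name given in the parameter list.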

Response

The Person API returns up to six types of data related to the person name submitted. A typical response looks like:

{
  "aliases": [
    "Jeff Dean",
    "Dean Jeffrey"],
  "companies": [
    "Google Inc",
    "Stanford University"],
  "people": [
    "Matt Cutts",
    "Karl Pfleger"],
  "specialties": [
    "electric digital data processing",
    "information retrieval",
    "user interface"],
  "locations": [
    "1600 Amphitheatre Pky, Mountain View, CA",
    "Palo Alto, CA"],
  "publications": [
    "Retaining wall masonry block",
    "System and method for reorganizing data storage in accordance with usage frequency",
    "Methods and apparatus for serving relevant advertisements"]
}
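A response of this shape can be loaded into a small typed container. The sketch below is one possible approach, not part of the API: the `PersonResult` class and `parse_person` function are hypothetical names, and it assumes that a data type with no results may simply be absent from the JSON, so missing keys default to empty lists.

```python
import json
from dataclasses import dataclass, field
from typing import List

# The six result types shown in the sample response.
FIELDS = ("aliases", "companies", "people",
          "specialties", "locations", "publications")


@dataclass
class PersonResult:
    aliases: List[str] = field(default_factory=list)
    companies: List[str] = field(default_factory=list)
    people: List[str] = field(default_factory=list)
    specialties: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    publications: List[str] = field(default_factory=list)


def parse_person(body: str) -> PersonResult:
    """Decode a Person API JSON body, tolerating missing result types."""
    data = json.loads(body)
    # Assumption: a missing key means "no results of that type".
    return PersonResult(**{name: list(data.get(name, [])) for name in FIELDS})


sample = ('{"aliases": ["Jeff Dean", "Dean Jeffrey"], '
          '"companies": ["Google Inc", "Stanford University"]}')
result = parse_person(sample)
print(result.companies)     # the two companies from the sample
print(result.publications)  # empty list, since the key is absent
```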

Aliases

Aliases include other names that may be alternatives to the name given. For instance, “Jeffrey Dean” returns “Jeffrey Dean”, “Jefrey Dean”, “Dean Jeff” and “Jeff R Dean”. These variations all appear in public documents and may include misspellings, longer or shorter given names, inclusion or exclusion of a middle name or initial, and different word orders.

Each alias may be called separately to return related entities and publications associated with that name. In subsequent versions we will add the ability to include/exclude any of these aliases.
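Because each alias is queried separately, a client that wants a combined view has to merge the results itself. The following sketch shows one way to do that; `merge_alias_results` is a hypothetical helper, and `lookup` stands for any function you supply that takes a name and returns the decoded response as a dictionary of lists.

```python
def merge_alias_results(aliases, lookup):
    """Call lookup(alias) for each alias and union the result lists.

    Values keep their first-seen order and duplicates are dropped.
    """
    merged = {}
    for alias in aliases:
        response = lookup(alias)
        for key, values in response.items():
            bucket = merged.setdefault(key, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged


# Stand-in for real API calls, using placeholder values only.
canned = {
    "Jeffrey Dean": {"companies": ["Google Inc"], "publications": ["A"]},
    "Jeff Dean": {"companies": ["Google Inc", "Stanford University"],
                  "publications": ["B"]},
}
combined = merge_alias_results(["Jeffrey Dean", "Jeff Dean"], canned.__getitem__)
print(combined["companies"])
```

Merging blindly can pull in a different person who shares an alias, so it is worth reviewing the alias list before combining.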

We also try to identify and score unique persons with identical names (the “John Smith” problem) using shared employers, shared agents, shared coworkers, skillsets, geolocation, time and other signals. In subsequent versions we will release

Companies

Companies includes companies or institutions such as universities with which this person has been affiliated. For instance, “Jeff Dean” returns “Google Inc” and “Stanford University”. These companies have appeared on public documents alongside the person. Company affiliations include but are not limited to:

  • assignee of a patent
  • owner of a trademark
  • company being registered
  • company under contract
  • disclosed employer on campaign contributions
  • disclosed relationship as manager, director or shareholder
  • law firm advising on a filing

Companies are not always explicitly identified as being a company. We look at entity type, role on the filing and inclusion of certain keywords in the name to determine if it is a company. These are not always correct.

Each company may then be looked up using the Company API.

People

People includes people with whom this person has been affiliated. For instance, “Jeffrey Dean” returns “Matt Cutts” and “Karl Pfleger”.

These people have appeared on public documents alongside the person. This means that it only includes people that are one degree separated from the person. It does not include second-degree relationships such as people affiliated with one or more of the affiliated companies.

Person affiliations include but are not limited to:

  • co-inventors on a patent
  • co-owners of a trademark
  • fellow manager, director or shareholder
  • lawyer advising on a filing
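Since the People list stops at one degree of separation, a client that needs second-degree contacts through people would have to make follow-up lookups on its own. The sketch below illustrates that idea under stated assumptions: `second_degree_people` is a hypothetical helper, and `lookup` is any function you supply that returns the decoded response for a name. It makes one extra request per first-degree contact, so it may be slow for well-connected people.

```python
def second_degree_people(name, lookup):
    """Return people two steps from `name`, excluding `name` itself
    and anyone already in the first-degree list."""
    first = list(lookup(name).get("people", []))
    seen = {name, *first}
    second = []
    for contact in first:
        for person in lookup(contact).get("people", []):
            if person not in seen:
                seen.add(person)
                second.append(person)
    return second


# Stand-in for real API calls, using placeholder names only.
graph = {
    "A": {"people": ["B", "C"]},
    "B": {"people": ["A", "D"]},
    "C": {"people": ["D", "E"]},
}
print(second_degree_people("A", lambda n: graph.get(n, {})))
```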

Specialties

Specialties include the most common technologies with which this person is affiliated. For instance, “Jeffrey Dean” returns “search”, “machine learning”, etc.

We identify and score unique skills by mining millions of publications and factoring in word frequency, patent classifications, inventors and other signals. This is a non-trivial problem and requires a large number of frequent but not useful terms to be filtered out.

Locations

Locations include addresses with which this person has been affiliated. For instance, “Jeffrey Dean” returns “1600 Amphitheatre Pky, Mountain View, CA” and “Palo Alto, CA”.

These addresses have appeared on public documents as either the person’s explicit address or the primary address on the filing. This means that it only includes addresses that are one degree separated from the person. It does not include second-degree relationships such as addresses affiliated with one or more of the affiliated companies.

Most addresses have been normalized and deduplicated using longitude and latitude. In many instances the address will not include details down to the street level. In these cases it will only include city and country.

Publications

Publications are all those documents on which the person appears. The full list of document types appears here.

Examples

  1. http://api.seravia.com/v1/people?name=Jeffrey%20Dean&companies=(Google%20Inc,Stanford%20University)
  2. http://api.seravia.com/v1/people?name=Dean%20Kamen
  3. http://api.seravia.com/v1/people?name=Nathan%20Myhrvold
  4. http://api.seravia.com/v1/people?name=Jonathan%20Ive
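Any of these URLs can be requested with an ordinary HTTP client. The sketch below uses Python's standard library; `fetch_person` is a hypothetical helper, and the `opener` argument is an assumption added so the function can be exercised offline with a canned body (here, the aliases from the sample response) instead of a live call.

```python
import io
import json
from urllib.request import urlopen


def fetch_person(url, opener=urlopen):
    """GET a Person API URL and return the decoded JSON body."""
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_opener(url):
    # Offline stand-in: returns part of the sample response, not live data.
    return io.BytesIO(b'{"aliases": ["Jeff Dean", "Dean Jeffrey"]}')


data = fetch_person(
    "http://api.seravia.com/v1/people?name=Jeffrey%20Dean",
    opener=fake_opener,
)
print(data["aliases"])
```

Calling `fetch_person(url)` without the `opener` argument performs a real request, which needs network access and presumably a valid key.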

Datasources

In the first version this information comes largely from ~55 million worldwide patents. Scholarly journals will be added in subsequent versions.

Obtaining a Key

If you are interested in advance access, please contact us at [email protected].
