Difference between revisions of "API"
Revision as of 09:11, 29 October 2013
This document contains the technical specification of the WOT API v0.4 and is aimed at developers implementing software that uses WOT services.
Terms and conditions
Before you continue, note that in addition to the Terms of Service, which apply to all WOT services, the API Terms of Service apply to any use of the API.
Registration
You need an API key to use the public API. In order to request a key, you need a WOT account. Once you have an account, you can request your API key from this page. After you have requested a key, you will see an API tab in your profile where you can update the information and access the key.
Note: any previous public APIs that did not require registration are deprecated and may have stricter rate limits applied to them. If your requests to these APIs return HTTP status code 429, you should migrate to public_link_json2.
Introduction
The WOT reputation system computes website reputations using ratings received from users and information from third-party sources. This section contains a brief introduction to some of the concepts you will need to know to develop applications that use reputation data.
Targets
Reputations are computed for websites, or targets, which are identified primarily by their DNS names. If a DNS name is not available for a target, an IPv4 or IPv6 address may be used instead. WOT also supports Internationalized Domain Names (IDN), which must be encoded to an ASCII representation as described in RFC 3490. For example:
ääkkönen.fi = xn--kknen-fraa0m.fi
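As a minimal sketch of this encoding step, Python's built-in `idna` codec (which follows RFC 3490) can produce the ASCII form of a target name. The helper name `to_ascii_target` is hypothetical and not part of the WOT API.

```python
def to_ascii_target(name: str) -> str:
    """Encode an internationalized domain name to its ASCII (IDNA) form."""
    # The built-in "idna" codec applies the RFC 3490 ToASCII operation
    # to each dot-separated label of the name.
    return name.encode("idna").decode("ascii")


print(to_ascii_target("ääkkönen.fi"))  # xn--kknen-fraa0m.fi
```

Names that are already plain ASCII pass through unchanged, so the helper can be applied to every target before it is placed in a request.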
Reputations
Reputations are measured for targets in several components. For each {target, component} pair, the system computes two values: a reputation estimate and the confidence in the reputation. Together, these indicate the amount of trust in the target in the given component.
Components
Reputation components are identified by numbers. These are the current components and their definitions:
Component identifier | Description | Example |
---|---|---|
0 | Trustworthiness | “How much do you trust this site?” |
4 | Child safety | “How suitable is this site for children?” |
Note: components 1 and 2 are deprecated. The API will continue returning data for these components to preserve compatibility, but they should not be used in new applications. Those interested in the type of information that was included in the deprecated components are encouraged to look into the categories described below.
Reputation and confidence values
The reputation r ∊ {0, ..., 100} is the estimate of the collective trust for the target in the given component. The higher the value, the more the community trusts the website. These are the definitions for the reputation values:
Reputation value | Description |
---|---|
≥ 80 | Excellent |
≥ 60 | Good |
≥ 40 | Unsatisfactory |
≥ 20 | Poor |
≥ 0 | Very poor |
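The table above can be read as a lookup from highest threshold to lowest. The following is a small sketch of that lookup; the function name `reputation_label` is an assumption made for illustration.

```python
def reputation_label(r: int) -> str:
    """Map a reputation value in 0..100 to its description from the table."""
    if not 0 <= r <= 100:
        raise ValueError("reputation must be between 0 and 100")
    # Thresholds are checked from highest to lowest, so the first match wins.
    thresholds = (
        (80, "Excellent"),
        (60, "Good"),
        (40, "Unsatisfactory"),
        (20, "Poor"),
        (0, "Very poor"),
    )
    for lower_bound, label in thresholds:
        if r >= lower_bound:
            return label
    raise AssertionError("unreachable: r >= 0 always matches the last row")
```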
WOT uses this visual representation to indicate reputations:
The confidence c ∊ {0, ..., 100} indicates the estimated reliability of the reputation r for the {target, component} pair. Again, the higher the value, the more reliable the system considers the reputation estimate.
You should use the confidence value to determine whether an action based on a poor reputation is warranted. For example, the WOT add-on requires a confidence value of ≥ 10 before it presents a warning about a website. Using a higher confidence threshold will result in fewer false positives, but also means your application will catch fewer poorly rated sites. You are encouraged to experiment with different confidence thresholds to see which suits your application the best.
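One way to combine the two values is sketched below. The confidence threshold of 10 is the one the text attributes to the WOT add-on; the reputation cut-off of 40 (anything rated "Poor" or "Very poor") and the function name `should_warn` are assumptions chosen for this example, and an application may well need different values.

```python
# Assumption: treat reputations below "Unsatisfactory" as worth a warning.
POOR_REPUTATION_BELOW = 40
# Confidence threshold that the text above attributes to the WOT add-on.
DEFAULT_MIN_CONFIDENCE = 10


def should_warn(r: int, c: int, min_confidence: int = DEFAULT_MIN_CONFIDENCE) -> bool:
    """Return True when a poor reputation is backed by enough confidence."""
    return r < POOR_REPUTATION_BELOW and c >= min_confidence
```

Raising `min_confidence` trades fewer false positives for fewer warnings, which is the trade-off described above.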
Categories
In addition to reputations, the rating system also computes categories for websites based on votes from users and third parties. Category data aims to explain the reason behind a poor reputation, and you can use the information to more specifically determine what type of action to take when coming across a poorly rated site. The current categories are as follows:
Category group | Category identifier | Description |
---|---|---|
Negative | 101 | Malware or viruses |
Negative | 102 | Poor customer experience |
Negative | 103 | Phishing |
Negative | 104 | Scam |
Negative | 105 | Potentially illegal |
Questionable | 201 | Misleading claims or unethical |
Questionable | 202 | Privacy risks |
Questionable | 203 | Suspicious |
Questionable | 204 | Hate, discrimination |
Questionable | 205 | Spam |
Questionable | 206 | Potentially unwanted programs |
Questionable | 207 | Ads / pop-ups |
Neutral | 301 | Online tracking |
Neutral | 302 | Alternative or controversial medicine |
Neutral | 303 | Opinions, religion, politics |
Neutral | 304 | Other |
Positive | 501 | Good site |
The following categories provide additional information about child safety:
Category group | Category identifier | Description |
---|---|---|
Negative | 401 | Adult content |
Questionable | 402 | Incidental nudity |
Questionable | 403 | Gruesome or shocking |
Positive | 404 | Site for kids |
For each category, the reputation system also computes a confidence value c ∊ {0, ..., 100}, similarly to reputations. The higher the value, the more reliable the category assignment can be considered. If you use categories to determine the severity of a poor reputation, you may want to use a lower confidence threshold for the category data.
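The sketch below shows one possible way to apply a confidence threshold to category data and look up the group of each remaining category. The mapping is transcribed from the two tables above; the function name `confident_categories` and the "Unknown" fallback for identifiers missing from the tables are assumptions.

```python
# Category identifier -> category group, transcribed from the tables above.
CATEGORY_GROUPS = {
    101: "Negative", 102: "Negative", 103: "Negative",
    104: "Negative", 105: "Negative",
    201: "Questionable", 202: "Questionable", 203: "Questionable",
    204: "Questionable", 205: "Questionable", 206: "Questionable",
    207: "Questionable",
    301: "Neutral", 302: "Neutral", 303: "Neutral", 304: "Neutral",
    501: "Positive",
    # Child safety categories
    401: "Negative", 402: "Questionable", 403: "Questionable",
    404: "Positive",
}


def confident_categories(categories: dict, min_confidence: int) -> dict:
    """Return {category id: group} for entries at or above the threshold.

    `categories` maps category identifiers (strings or ints) to confidence
    values, in the shape of the "categories" attribute of an API response.
    """
    result = {}
    for identifier, confidence in categories.items():
        if confidence >= min_confidence:
            category_id = int(identifier)
            # Assumption: identifiers not listed above are reported as "Unknown".
            result[category_id] = CATEGORY_GROUPS.get(category_id, "Unknown")
    return result
```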
WOT uses this visual representation for category groups and their confidence:
Third-party blacklists
If a website is included in a third-party blacklist and it's possible that this blacklisting affects its reputation, the API will return information about the type of blacklist the site was found in, and when the site was last added there. Here are the current blacklist types:
Blacklist type | Description |
---|---|
malware | Site is blacklisted for hosting malware. |
phishing | Site is blacklisted for hosting a phishing page. |
scam | Site is blacklisted for hosting a scam (e.g. a rogue pharmacy). |
Note that if a site appears on multiple third-party blacklists of the same type, the API returns the most recent time at which it was added to any of them.
Requests
The API consists of a number of interfaces, all of which are called using normal HTTP GET requests to api.mywot.com and return a response in XML or JSON format if successful. HTTP status codes are used for returning error information and parameters are passed using standard URL conventions. The request format is as follows:
http://api.mywot.com/version/interface?param1=value1&param2=value2
TLS encryption can be used with all interfaces, for example when requests to the reputation API are made from a secure web page.
Documentation: Reputation API
public_link_json2
The public_link_json2 API is used for requesting reputations for multiple targets.
Parameters
Parameter | Description |
---|---|
hosts | A list of target names separated with a forward slash (“/”). For example, www.example.com/another.example.net/onemore.example.org/. The value must end with a slash and must include at most 100 target names. Note: the full request path must also be less than 8 KiB or it will be rejected. |
callback (optional) | The name of the callback function for a response in the JSONP (JSON with Padding) format. |
key | Your API key. |
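The constraints on these parameters can be checked before a request is sent. The following is a sketch only: the function name `build_request_url` is hypothetical, and it assumes that the 8 KiB limit covers the path together with the query string.

```python
from urllib.parse import urlencode

API_HOST = "http://api.mywot.com"
API_PATH = "/0.4/public_link_json2"
MAX_HOSTS = 100
MAX_PATH_BYTES = 8 * 1024  # the request path must stay below 8 KiB


def build_request_url(hosts, key, callback=None):
    """Build a public_link_json2 request URL from a list of target names."""
    if not hosts or len(hosts) > MAX_HOSTS:
        raise ValueError(f"expected between 1 and {MAX_HOSTS} target names")
    # Targets are separated by "/" and the value must end with a slash.
    params = [("hosts", "/".join(hosts) + "/")]
    if callback:
        params.append(("callback", callback))
    params.append(("key", key))
    # Keep "/" unescaped so the hosts value keeps its documented form.
    request_path = f"{API_PATH}?{urlencode(params, safe='/')}"
    # Assumption: the size limit applies to the path plus the query string.
    if len(request_path) >= MAX_PATH_BYTES:
        raise ValueError("request path must be less than 8 KiB")
    return API_HOST + request_path
```

Callers with more than 100 targets would need to split them across several requests.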
Return codes
If the call is successful, the returned HTTP status code is 200. If a server-side error occurred, the status code is 500. If the request included an invalid API key or incorrect parameters, the status code is 403. If you have exceeded your daily request quota, the status code is 429.
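A client can translate these status codes into readable messages, for instance as in the sketch below. The names `STATUS_MEANINGS` and `describe_status` are hypothetical, and the fallback text for other codes is an assumption, since only the four codes above are documented here.

```python
# HTTP status codes documented for public_link_json2.
STATUS_MEANINGS = {
    200: "success",
    403: "invalid API key or incorrect parameters",
    429: "daily request quota exceeded",
    500: "server-side error",
}


def describe_status(status: int) -> str:
    """Return a short description of an HTTP status code from the API."""
    # Assumption: any other code is reported as undocumented.
    return STATUS_MEANINGS.get(status, f"undocumented status code {status}")
```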
Output
The API returns reputations, categories, and third-party blacklist information in JSON or JSONP format, depending on whether the callback parameter is specified in the request. The format is as follows:
- The response object has one attribute for each target, named by the unchanged target name given in the hosts parameter.
- Each target object has a target attribute, which contains the normalized target name.
- Each target object also has zero or more component attributes with names ∊ {“0”, ...}.
- Each component attribute contains an array with {r, c} values for the reputation component. If the reputation for a component is not known, the corresponding attribute is omitted from the output.
- Each target object also may have a categories attribute, which contains one or more category identifier attributes and their confidence values.
- Each target object also may have a blacklists attribute, which contains one or more blacklist type attributes, each with a Unix timestamp of when the site was last added to a third-party source of the given type.
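The structure described in this list can be unpacked as in the following sketch, which takes one already-decoded target object. The function name `summarize_target` and the shape of the returned dictionary are assumptions made for illustration.

```python
from datetime import datetime, timezone


def summarize_target(entry: dict) -> dict:
    """Split one target object into components, categories and blacklists."""
    components = {}
    for name, value in entry.items():
        # Component attributes are named by digits, for example "0" or "4".
        if name.isdigit():
            reputation, confidence = value
            components[int(name)] = {
                "reputation": reputation,
                "confidence": confidence,
            }
    # Both attributes are optional, so fall back to empty mappings.
    categories = {int(k): v for k, v in entry.get("categories", {}).items()}
    blacklists = {
        kind: datetime.fromtimestamp(timestamp, tz=timezone.utc)
        for kind, timestamp in entry.get("blacklists", {}).items()
    }
    return {
        "target": entry["target"],
        "components": components,
        "categories": categories,
        "blacklists": blacklists,
    }
```

Because unknown components are omitted from the output, callers should check for a component key before reading it.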
Example
Request:
http://api.mywot.com/0.4/public_link_json2?hosts=example.COM/www.EXAMPLE.NET/&callback=process&key=<your API key>
Response:
process({
  "example.COM": {
    "target": "example.com",
    "0": [ 91, 53 ],
    "4": [ 93, 53 ],
    "categories": { "501": 71, "304": 37 }
  },
  "www.EXAMPLE.NET": {
    "target": "example.net",
    "0": [ 9, 43 ],
    "categories": { "101": 54 },
    "blacklists": { "malware": 1362123608 }
  }
})
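Outside a browser, a JSONP response like the one above has to be unwrapped before it can be parsed as JSON. The sketch below shows one way to do that; the function name `parse_jsonp` is hypothetical, and tolerating a trailing semicolon is an assumption, as the example response does not include one.

```python
import json


def parse_jsonp(body: str, callback: str) -> dict:
    """Strip the callback wrapper from a JSONP body and decode the JSON."""
    # Assumption: a trailing semicolon may follow the closing parenthesis.
    text = body.strip().rstrip(";").rstrip()
    prefix = callback + "("
    if not (text.startswith(prefix) and text.endswith(")")):
        raise ValueError("body is not wrapped in the expected callback")
    # Drop the "callback(" prefix and the closing ")".
    return json.loads(text[len(prefix):-1])
```

When the callback parameter is omitted from the request, the body is plain JSON and `json.loads` can be applied to it directly.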