Atlas Data Federation supports publicly accessible URLs as federated database instance stores. You must define mappings in your federated database instance to your HTTP data stores to run queries against your data.
Important
Information in your storage configuration is visible internally at MongoDB and stored as operational data to monitor and improve the performance of Atlas Data Federation. So, we recommend that you do not use PII in your configurations.
Example Configuration for HTTP Data Store
Consider URLs https://www.datacenter-hardware.com/data.json, https://www.datacenter-software.com/data.json, and https://www.datacenter-metrics.com/data.json containing data collected from a datacenter. The following configuration:
- Specifies the publicly accessible URLs that contain data in files as a federated database instance store.
- Creates a partition for each URL.
{
  "stores" : [
    {
      "name" : "httpStore",
      "provider" : "http",
      "allowInsecure" : false,
      "urls" : [
        "https://www.datacenter-hardware.com/data.json",
        "https://www.datacenter-software.com/data.json"
      ],
      "defaultFormat" : ".json"
    }
  ],
  "databases" : [
    {
      "name" : "dataCenter",
      "collections" : [
        {
          "name" : "inventory",
          "dataSources" : [
            {
              "storeName" : "httpStore",
              "allowInsecure" : false,
              "urls" : [
                "https://www.datacenter-metrics.com/data"
              ],
              "defaultFormat" : ".json"
            }
          ]
        }
      ]
    }
  ]
}
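To make the "one partition per URL" idea concrete, the following Python sketch loads the example configuration and lists the URLs that back each collection. It is only an illustration: the helper name `urls_per_collection` is hypothetical, and the sketch assumes that a collection draws on both the URLs of the referenced store and the URLs listed in its own `dataSources` entry.

```python
import json

# The example storage configuration from above, embedded as a JSON string.
CONFIG = json.loads("""
{
  "stores": [
    {
      "name": "httpStore",
      "provider": "http",
      "allowInsecure": false,
      "urls": [
        "https://www.datacenter-hardware.com/data.json",
        "https://www.datacenter-software.com/data.json"
      ],
      "defaultFormat": ".json"
    }
  ],
  "databases": [
    {
      "name": "dataCenter",
      "collections": [
        {
          "name": "inventory",
          "dataSources": [
            {
              "storeName": "httpStore",
              "allowInsecure": false,
              "urls": ["https://www.datacenter-metrics.com/data"],
              "defaultFormat": ".json"
            }
          ]
        }
      ]
    }
  ]
}
""")


def urls_per_collection(config):
    """Map "database.collection" to the list of URLs assumed to back it."""
    stores = {store["name"]: store for store in config["stores"]}
    result = {}
    for db in config["databases"]:
        for coll in db["collections"]:
            urls = []
            for src in coll["dataSources"]:
                store = stores[src["storeName"]]
                # Assumption: store-level and data-source-level URLs are combined.
                for url in store.get("urls", []) + src.get("urls", []):
                    if url not in urls:
                        urls.append(url)
            result[f'{db["name"]}.{coll["name"]}'] = urls
    return result


partitions = urls_per_collection(CONFIG)
for namespace, urls in partitions.items():
    print(namespace, "->", len(urls), "URL(s)")
```

Note that the third URL has no file extension, so in this example its format would come from the `defaultFormat` value rather than from the URL itself.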
Configuration Format
The federated database instance configuration has the following format:
{
  "stores" : [
    {
      "name" : "<string>",
      "provider": "<string>",
      "defaultFormat" : "<string>",
      "allowInsecure": <boolean>,
      "urls": ["<string>"]
    }
  ],
  "databases" : [
    {
      "name" : "<string>",
      "collections" : [
        {
          "name" : "<string>",
          "dataSources" : [
            {
              "storeName" : "<string>",
              "allowInsecure" : <boolean>,
              "urls" : ["<string>"],
              "defaultFormat" : "<string>",
              "provenanceFieldName": "<string>"
            }
          ]
        }
      ],
      "views" : [
        {
          "name" : "<string>",
          "source" : "<string>",
          "pipeline" : "<string>"
        }
      ]
    }
  ]
}
stores
- The stores object defines each data store associated with the federated database instance. The federated database instance store captures files stored at publicly accessible URLs. Data Federation can only access data stores defined in the stores object.
databases
- The databases object defines the mapping between each federated database instance store defined in stores and MongoDB collections in the databases.
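Because Data Federation can only access stores defined in the stores object, it can help to check a configuration for dangling references before applying it. The Python sketch below is one way to do that locally; the function name `undefined_store_refs` and the sample collection and store names `orphans` and `missingStore` are hypothetical, and the check is an illustration rather than anything Atlas runs on your behalf.

```python
def undefined_store_refs(config):
    """Return (database, collection, storeName) triples whose storeName
    does not match the name of any entry in the stores array."""
    defined = {store["name"] for store in config.get("stores", [])}
    missing = []
    for db in config.get("databases", []):
        for coll in db.get("collections", []):
            for src in coll.get("dataSources", []):
                if src.get("storeName") not in defined:
                    missing.append((db["name"], coll["name"], src.get("storeName")))
    return missing


# Hypothetical configuration with one valid and one dangling reference.
sample_config = {
    "stores": [{"name": "httpStore", "provider": "http", "urls": []}],
    "databases": [
        {
            "name": "dataCenter",
            "collections": [
                {"name": "inventory", "dataSources": [{"storeName": "httpStore"}]},
                {"name": "orphans", "dataSources": [{"storeName": "missingStore"}]},
            ],
        }
    ],
}

problems = undefined_store_refs(sample_config)
print(problems)
```

An empty list means every data source points at a defined store.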
stores
"stores" : [
  {
    "name" : "<string>",
    "provider" : "<string>",
    "allowInsecure": <boolean>,
    "urls" : ["<string>"],
    "defaultFormat" : "<string>"
  }
]
Field | Type | Necessity | Description
---|---|---|---
stores | array | required | Array of objects where each object represents a data store to associate with the federated database instance. The federated database instance store captures files stored at publicly accessible URLs. Atlas Data Federation can only access data stores defined in the stores object.
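stores.[n].name | string | required | Name of the federated database instance store. The databases.[n].collections.[n].dataSources.[n].storeName field references this value as part of the mapping configuration.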
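stores.[n].provider | string | required | Defines where the data is stored. Value must be http for data in files hosted at publicly accessible URLs.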
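stores.[n].allowInsecure | boolean | optional | Validates the scheme in the specified URLs. Value can be one of the following: true to allow the insecure HTTP scheme, or false to allow only the secure HTTPS scheme. If true, Atlas Data Federation does not verify the server's certificate chain and hostname, and accepts any certificate with any hostname that the server presents. WARNING: If you set this to true, your data might become vulnerable to a man-in-the-middle attack, which can compromise the confidentiality and integrity of your data. If omitted, defaults to false.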
stores.[n].urls | array | optional | Comma-separated list of publicly accessible HTTP URLs where data is stored. You can't specify URLs that require authentication.
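stores.[n].defaultFormat | string | optional | Default format that Data Federation assumes if it encounters a file without an extension while searching the databases.[n].collections.[n].dataSources.[n].storeName. The following values are valid for the defaultFormat field: .json, .json.gz, .bson, .bson.gz, .avro, .avro.gz, .orc, .tsv, .tsv.gz, .csv, .csv.gz, .parquet. IMPORTANT: If your file format is CSV or TSV, you must include a header row in your data. If omitted, Data Federation attempts to detect the file type by processing a few bytes of the file. The specified format only applies to the URLs specified in the stores object.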
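As a rough illustration of what validating the scheme means, the following Python sketch flags URLs that a given allowInsecure value would not permit. The helper `rejected_urls` is hypothetical, and the sketch assumes that false permits only https while true additionally permits http; it mirrors the description in the table, not the actual Atlas implementation.

```python
from urllib.parse import urlparse


def rejected_urls(urls, allow_insecure=False):
    """Return the URLs whose scheme is not permitted under allow_insecure."""
    # Assumption: https is always accepted; http only when allow_insecure is True.
    allowed = {"https", "http"} if allow_insecure else {"https"}
    return [url for url in urls if urlparse(url).scheme.lower() not in allowed]


candidate_urls = [
    "https://www.datacenter-hardware.com/data.json",
    "http://www.datacenter-software.com/data.json",
]

strict = rejected_urls(candidate_urls, allow_insecure=False)
relaxed = rejected_urls(candidate_urls, allow_insecure=True)
print("rejected when allowInsecure is false:", strict)
print("rejected when allowInsecure is true:", relaxed)
```

Running a check like this before saving a configuration can surface plain http URLs early, which is usually preferable to enabling allowInsecure.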
databases
"databases" : [
  {
    "name" : "<string>",
    "collections" : [
      {
        "name" : "<string>",
        "dataSources" : [
          {
            "storeName" : "<string>",
            "allowInsecure" : <boolean>,
            "urls" : ["<string>"],
            "defaultFormat" : "<string>",
            "provenanceFieldName": "<string>"
          }
        ]
      }
    ]
  }
]
Field | Type | Necessity | Description
---|---|---|---
databases | array | required | Array of objects where each object represents a database, its collections, and, optionally, any views on the collections. Each database can have multiple collections and views objects.
databases.[n].name | string | required | Name of the database to which Atlas Data Federation maps the data contained in the data store.
databases.[n].collections | array | required | Array of objects where each object represents a collection and the data sources that map to a stores federated database instance store.
databases.[n].collections.[n].name | string | required | Name of the collection to which Atlas Data Federation maps the data contained in each databases.[n].collections.[n].dataSources.[n].storeName. IMPORTANT: Wildcard * collections are not available for HTTP stores.
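databases.[n].collections.[n].dataSources | array | required | Array of objects where each object represents a stores federated database instance store to map with the collection.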
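databases.[n].collections.[n].dataSources.[n].storeName | string | required | Name of a federated database instance store to map to the collection. Must match the name of a store in the stores array.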
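databases.[n].collections.[n].dataSources.[n].allowInsecure | boolean | optional | Validates the scheme in the specified URLs. Value can be one of the following: true to allow the insecure HTTP scheme, or false to allow only the secure HTTPS scheme. If true, Atlas Data Federation does not verify the server's certificate chain and hostname, and accepts any certificate with any hostname that the server presents. WARNING: If you set this to true, your data might become vulnerable to a man-in-the-middle attack, which can compromise the confidentiality and integrity of your data. If omitted, defaults to false.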
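databases.[n].collections.[n].dataSources.[n].urls | array | optional | Comma-separated list of publicly accessible URLs where the data is stored. The federated database instance creates a partition for each URL. You can specify URLs that are not in the stores.[n].urls array. If omitted, Data Federation uses the URLs in the stores.[n].urls array of the specified store.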
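databases.[n].collections.[n].dataSources.[n].defaultFormat | string | optional | Default format that Data Federation assumes if it encounters a file without an extension while searching the databases.[n].collections.[n].dataSources.[n].storeName. The following values are valid for the defaultFormat field: .json, .json.gz, .bson, .bson.gz, .avro, .avro.gz, .orc, .tsv, .tsv.gz, .csv, .csv.gz, .parquet. IMPORTANT: If your file format is CSV or TSV, you must include a header row in your data. If omitted, Data Federation attempts to detect the file type by processing a few bytes of the file. The specified format only applies to the URLs specified in the databases.[n].collections.[n].dataSources object.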
databases.[n].collections.[n].dataSources.[n].provenanceFieldName | string | required | Name for the field that includes the provenance of the documents in the results. If you specify this setting in the storage configuration, Atlas Data Federation returns this field, populated with provenance details, for each document in the result. You can't configure this setting using the Visual Editor in the Atlas UI.
databases.[n].views | array | required | Array of objects where each object represents an aggregation pipeline on a collection. To learn more about views, see Views.
databases.[n].views.[n].name | string | required | Name of the view.
databases.[n].views.[n].source | string | required | Name of the source collection for the view. If you want to create a view with a $sql stage, you must omit this field because the SQL statement specifies the source collection.
databases.[n].views.[n].pipeline | array | required | Aggregation pipeline stage(s) to apply to the databases.[n].views.[n].source collection.
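The defaultFormat behavior described in the table can be pictured with a small Python sketch: a URL that carries a file extension uses that extension, and a URL without one falls back to defaultFormat. The function `assumed_format` is hypothetical and only approximates the rule as documented here; when no default is given, it returns None to stand in for the byte-sniffing detection that Data Federation performs, which this sketch does not attempt to reproduce.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def assumed_format(url, default_format=None):
    """Return the extension found in the URL path, else default_format."""
    # suffixes looks only at the final path component, e.g. "data.json.gz".
    suffixes = PurePosixPath(urlparse(url).path).suffixes
    if suffixes:
        return "".join(suffixes)
    # No extension: fall back to the configured default (None means "detect").
    return default_format


with_extension = assumed_format("https://www.datacenter-hardware.com/data.json", ".csv")
without_extension = assumed_format("https://www.datacenter-metrics.com/data", ".json")
no_default = assumed_format("https://www.datacenter-metrics.com/data")
print(with_extension, without_extension, no_default)
```

This matches the example configuration, where https://www.datacenter-metrics.com/data has no extension and relies on the ".json" default.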