SDK & CLI#
Tip
Our Python SDK is open source! Check out github.com/roboto-ai/roboto-python-sdk.
Overview#
The Python SDK and CLI can both be used to programmatically interact with Roboto.
The CLI is convenient for quickly creating new datasets, uploading or downloading files, and running actions. The Python SDK has comprehensive support for all Roboto platform features and is especially useful for data analysis and integration with your other tools.
See reference documentation for the Python SDK and the CLI.
Install#
Refer to the latest installation instructions available on GitHub.
CLI#
With the Python SDK or the standalone CLI installed, you can run roboto on the command line.
The example below shows how to create a new dataset and upload a file to it.
$ roboto datasets create --tag boston
{
  "administrator": "Roboto",
  "created": "2024-09-25T22:22:48.271387Z",
  "created_by": "benji@roboto.ai",
  "dataset_id": "ds_9ggdi910gntp",
  ...
  "tags": [
    "boston"
  ]
}
$ roboto datasets upload-files -d ds_9ggdi910gntp -p scene57.bag
100.0%|█████████████████████████ | 58.9M/58.9M | 2.62MB/s | 00:23 | Src: 1 file
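When the CLI is driven from a Python script, the dataset ID can be read from the command's output instead of being copied by hand. The sketch below assumes that `roboto datasets create` prints a single valid JSON object to stdout, as the example above suggests. `extract_dataset_id` and `create_dataset_with_tag` are hypothetical helper names, not part of the SDK or CLI.

```python
import json
import subprocess


def extract_dataset_id(cli_output: str) -> str:
    """Return the "dataset_id" field from the CLI's JSON output.

    Assumption: the output is one valid JSON object containing "dataset_id".
    """
    record = json.loads(cli_output)
    return record["dataset_id"]


def create_dataset_with_tag(tag: str) -> str:
    """Run `roboto datasets create --tag <tag>` and return the new dataset ID.

    Requires the roboto CLI on PATH and configured credentials.
    """
    completed = subprocess.run(
        ["roboto", "datasets", "create", "--tag", tag],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return extract_dataset_id(completed.stdout)
```

Under those assumptions, `create_dataset_with_tag("boston")` would return an ID that can be passed to `roboto datasets upload-files -d`.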
Python SDK#
With the Python SDK installed, you can import roboto into your Python runtime.
The example below shows how to access topic data for an ingested ROS bag file:
from roboto import Dataset

ds = Dataset.from_id("ds_9ggdi910gntp")
bag = ds.get_file_by_path("scene57.bag")
steering_topic = bag.get_topic("/vehicle_monitor/steering")
steering_data = steering_topic.get_data(
    start_time="1714513576",  # "<sec>.<nsec>" since epoch
    end_time="1714513590",
)
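The time bounds are passed as strings, so a timestamp held as a `datetime` has to be converted first. The following is a minimal sketch, assuming the "<sec>.<nsec>" format means whole seconds since the Unix epoch, a dot, and a nanosecond fraction zero-padded to nine digits. `to_epoch_string` is a hypothetical helper, not an SDK function.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_string(moment: datetime) -> str:
    """Format a timezone-aware datetime as "<sec>.<nsec>" since the Unix epoch.

    Assumption: the fraction is nanoseconds, zero-padded to nine digits.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    elapsed = moment - EPOCH
    # Integer arithmetic avoids float rounding on large epoch values.
    seconds = elapsed.days * 86_400 + elapsed.seconds
    nanoseconds = elapsed.microseconds * 1_000
    return f"{seconds}.{nanoseconds:09d}"


start = datetime.fromtimestamp(1714513576, tz=timezone.utc)
print(to_epoch_string(start))  # 1714513576.000000000
```

Requiring a timezone-aware input avoids silently interpreting a naive `datetime` in the local timezone.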
You can also create events:
from roboto import Event

Event.create(
    start_time="1714513580",  # "<sec>.<nsec>" since epoch
    end_time="1714513590",
    name="Fast Turn",
    topic_ids=[steering_topic.topic_id],
)
Or even search for logs matching metadata and/or statistics with RoboQL:
from roboto import query, RobotoSearch

roboto_search = RobotoSearch(query.client.QueryClient())
# Named roboql so the string does not shadow the imported query module
roboql = '''
dataset.tags CONTAINS 'boston' AND
topics[0].msgpaths[/vehicle_monitor/vehicle_speed.data].max > 20
'''
results = roboto_search.find_files(roboql)
For complete examples, refer to our collection of notebooks and the SDK reference.
Handling Eventual Consistency#
Some Roboto API operations use eventually consistent reads for improved performance and scalability. This means that query results may lag behind very recent writes, typically by less than 1 second.
Affected Operations#
The following SDK operations may exhibit eventual consistency:
Query operations: RobotoSearch.find_files(), RobotoSearch.find_datasets(), and other search methods
Query result fetching: Getting results or counts from submitted queries
You may encounter eventual consistency when:#
You create or update data and immediately query for it
You’re running automated tests that create data and then search for it
You’re building workflows that depend on reading data immediately after writing it
Retry Pattern#
If you need to ensure recently written data appears in query results, use a retry with a brief delay:
import time

from roboto import Dataset, RobotoSearch, query

# Create a dataset
ds = Dataset.create(tags=["my-new-tag"])

# Search for it with retry logic
roboto_search = RobotoSearch(query.client.QueryClient())
max_attempts = 5
delay_seconds = 1
for attempt in range(max_attempts):
    results = roboto_search.find_datasets('tags CONTAINS "my-new-tag"')
    if any(r.dataset_id == ds.dataset_id for r in results):
        print(f"Found dataset after {attempt + 1} attempt(s)")
        break
    if attempt < max_attempts - 1:
        time.sleep(delay_seconds)
else:
    # Runs only if the loop finished without hitting break
    print("Dataset not found after retries")
For most use cases, a single retry after 1 second is sufficient, as replication lag is typically sub-second.
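The retry loop can also be factored into a reusable helper so that tests and workflows share one retry policy. This is a sketch under stated assumptions: `retry_until` is a hypothetical name, `fetch` stands in for any query call (for example, a lambda wrapping `roboto_search.find_datasets(...)`), and `sleep` is injectable only so the helper can be exercised without real delays.

```python
import time
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def retry_until(
    fetch: Callable[[], Iterable[T]],
    matches: Callable[[T], bool],
    max_attempts: int = 5,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until one result satisfies matches(), or attempts run out.

    Returns the first matching result, or None if none appeared.
    """
    for attempt in range(max_attempts):
        for result in fetch():
            if matches(result):
                return result
        # Pause between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return None
```

With a dataset `ds` and a `roboto_search` client in scope, a call might look like `retry_until(lambda: roboto_search.find_datasets('tags CONTAINS "my-new-tag"'), lambda r: r.dataset_id == ds.dataset_id)`.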
Best Practices#
Avoid tight loops: Don’t poll continuously without delays
Use reasonable timeouts: Most data replicates within 1-2 seconds
Design for eventual consistency: When possible, structure workflows to avoid immediate read-after-write dependencies
Strongly consistent alternatives: If you need guaranteed immediate consistency, use direct record lookups (e.g., Dataset.from_id()) instead of queries
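The first two practices can be combined by bounding the wait with a deadline rather than an attempt count, and by widening the delay between polls. The following is a minimal sketch. `wait_for`, its default values, and the injectable `clock` and `sleep` parameters are illustrative assumptions, not SDK features.

```python
import time
from typing import Callable


def wait_for(
    condition: Callable[[], bool],
    timeout_seconds: float = 5.0,
    initial_delay: float = 0.25,
    max_delay: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll condition() with exponential backoff until it holds or time runs out."""
    deadline = clock() + timeout_seconds
    delay = initial_delay
    while True:
        if condition():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline, and never poll without a pause.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

Here `condition` would wrap the query and the check for the expected record. The default timeout is a placeholder to tune against the replication lag you observe.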
For more details on which specific API endpoints use eventual consistency, see the REST API documentation.