The Segments.ai Python SDK: A closer look

September 12th, 2023 - 2 min

In this blog post, we go over the design decisions and features of the Segments.ai Python SDK.

Most of our users at Segments.ai use the platform to get labeled data for training deep learning models, so most of them are pretty technical.

Being able to integrate with the Segments.ai platform programmatically makes the developer experience a lot smoother. In the next sections, we’ll go over five main features of the Python SDK.

1. RESTful endpoints

The Python SDK is a light wrapper around our public API, which communicates with the Segments.ai backend. We chose a REST API design over alternatives such as GraphQL because of its simplicity. The methods in our Python SDK closely mirror the REST endpoints.

Let’s look at the Sample endpoints as an example. The available methods in our Python SDK are self-explanatory:

  • get_sample(sample_uuid) → Get a single sample
  • get_samples(dataset_identifier) → Get a list of samples
  • add_sample(dataset_identifier, name, attributes) → Add a single sample
  • add_samples(dataset_identifier, samples) → Add multiple samples
  • update_sample(sample_uuid, name, attributes) → Update a sample
  • delete_sample(sample_uuid) → Delete a sample

For simplicity, we only included the most important arguments here; you can find more details in the SDK reference. These methods correspond directly to the four HTTP verbs used to interact with REST APIs:

  • GET: get_sample(), get_samples()
  • POST: add_sample(), add_samples()
  • PUT: update_sample()
  • DELETE: delete_sample()

This simplicity makes for a very intuitive developer experience.
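
To make the mapping concrete, here is a minimal sketch of what a thin wrapper of this kind can look like. It is not the SDK's actual implementation: the `SketchClient` and `Request` classes are hypothetical, the URL paths are assumptions made for illustration, and requests are recorded rather than sent so the example runs without a network connection or API key.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Request:
    """A request the wrapper would send; recorded instead of sent in this sketch."""

    verb: str
    path: str
    payload: Optional[Dict[str, Any]] = None


class SketchClient:
    """Hypothetical thin wrapper with one method per REST endpoint.

    The URL paths below are assumptions for this example, not the
    documented Segments.ai routes.
    """

    def __init__(self) -> None:
        self.sent: List[Request] = []

    def _request(
        self, verb: str, path: str, payload: Optional[Dict[str, Any]] = None
    ) -> Request:
        # A real client would issue an HTTP call here; we only record it.
        request = Request(verb, path, payload)
        self.sent.append(request)
        return request

    def get_sample(self, sample_uuid: str) -> Request:
        return self._request("GET", f"/samples/{sample_uuid}/")

    def get_samples(self, dataset_identifier: str) -> Request:
        return self._request("GET", f"/datasets/{dataset_identifier}/samples/")

    def add_sample(
        self, dataset_identifier: str, name: str, attributes: Dict[str, Any]
    ) -> Request:
        payload = {"name": name, "attributes": attributes}
        return self._request(
            "POST", f"/datasets/{dataset_identifier}/samples/", payload
        )

    def update_sample(
        self, sample_uuid: str, name: str, attributes: Dict[str, Any]
    ) -> Request:
        payload = {"name": name, "attributes": attributes}
        return self._request("PUT", f"/samples/{sample_uuid}/", payload)

    def delete_sample(self, sample_uuid: str) -> Request:
        return self._request("DELETE", f"/samples/{sample_uuid}/")


client = SketchClient()
client.add_sample("jane/flowers", "rose.jpg", {"image": {"url": "https://example.com/rose.jpg"}})
client.delete_sample("1234")
for request in client.sent:
    print(request.verb, request.path)
```

Because every public method is a one-line call into `_request`, knowing the REST endpoint is enough to guess the method name and its arguments, and vice versa.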

2. Data validation

In the first version of our Python SDK, our methods returned plain dictionaries.

This is not ideal because it’s hard to know what’s inside a dictionary – it can contain anything. To overcome this problem we revamped the Python SDK by adding type hints and Pydantic classes representing the resources in the backend: a dataset, sample, label, issue, etc.

For example, the get_sample() method now returns a Sample object containing typed fields:

  • Sample:
    • uuid: str
    • name: str
    • attributes: Union[ImageSampleAttributes, …]
    • created_at: str
    • created_by: str
    • label: Label
    • issues: List[Issue]
    • etc.

This means that you’re now getting automatic data validation, as well as type hint auto-completion in your code editor.
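
The difference between a plain dictionary and a typed object can be sketched with the standard library alone. The example below is a simplified stand-in, not the SDK's Pydantic model: it uses a `dataclass` with hand-written checks where Pydantic would validate declaratively, and it keeps only a few of the fields listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    """Simplified stand-in for a typed resource (assumption: reduced field set)."""

    uuid: str
    name: str
    created_at: str
    created_by: str
    issues: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject wrong types at construction time instead of failing later
        # somewhere deep in user code. Pydantic performs this kind of check
        # automatically from the type hints.
        for field_name in ("uuid", "name", "created_at", "created_by"):
            value = getattr(self, field_name)
            if not isinstance(value, str):
                raise TypeError(
                    f"{field_name} must be a str, got {type(value).__name__}"
                )


payload = {
    "uuid": "1234",
    "name": "rose.jpg",
    "created_at": "2023-09-12T10:00:00Z",
    "created_by": "jane",
}

sample = Sample(**payload)
print(sample.name)  # attribute access that an editor can auto-complete

try:
    Sample(**{**payload, "name": 42})
except TypeError as error:
    print(f"Rejected: {error}")
```

With a dictionary, the bad `name` value would only surface when some later code tried to use it as a string; with a typed object, the problem is reported where the data enters your program.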

3. Improved error handling

The first version of our Python SDK simply printed any error messages and returned them as strings. It served its purpose, but if you wanted to handle errors effectively, you had to dig through the string for certain keywords to identify the type of error.

In the current version of the Python SDK, we’ve improved the error handling mechanism by raising proper exceptions with cleaner and more accurate error messages.

Errors like the following can now be quickly identified and resolved:

  • Uploading a point cloud sample to an image dataset triggers a ValidationError thanks to the Pydantic data validation.
  • If you attempt to add a collaborator to a dataset for which you don’t have the appropriate permissions, an AuthorizationError is raised.
  • When you create a dataset that already exists, an AlreadyExistsError is raised.
  • If you’re querying the Segments.ai API too fast, you get an APILimitError, with an error message indicating that you should rate limit your requests.
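
Typed exceptions let calling code react to each failure differently without parsing message strings. The sketch below shows one way to do that, under stated assumptions: the exception classes are local stand-ins that reuse two names from the list above, the `SdkError` base class and the `call_with_retry` helper are hypothetical, and the API call is simulated.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SdkError(Exception):
    """Hypothetical base class for the stand-in exceptions in this sketch."""


class AlreadyExistsError(SdkError):
    """Stand-in: the resource being created already exists."""


class APILimitError(SdkError):
    """Stand-in: the API is being queried too fast."""


def call_with_retry(action: Callable[[], T], retries: int = 3, delay: float = 1.0) -> T:
    """Run `action`, retrying with exponential backoff when rate limited."""
    for attempt in range(retries):
        try:
            return action()
        except APILimitError:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")


attempts = {"count": 0}


def create_dataset() -> str:
    # Simulated API call: rate limited twice, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise APILimitError("Too many requests, slow down.")
    return "created"


try:
    print(call_with_retry(create_dataset, delay=0.01))
except AlreadyExistsError:
    print("Dataset already exists, reusing it.")
```

Only the rate limit error triggers a retry; any other exception propagates immediately, which is hard to get right when errors arrive as strings.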

4. Auto-generated documentation

We use Sphinx and the Furo theme to make sure the SDK docs are always up to date. These docs are automatically generated from the Python docstrings in our code, and hosted on Read the Docs.
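
As a rough illustration of the input such a pipeline works from, the function below carries a structured docstring. The docstring layout (Google style) and the simplified return type are assumptions made for this example; the SDK's own docstrings may be formatted differently.

```python
import inspect


def get_sample(sample_uuid: str) -> dict:
    """Get a single sample.

    Args:
        sample_uuid: The UUID of the sample to fetch.

    Returns:
        The sample as a dictionary (simplified for this sketch).
    """
    return {"uuid": sample_uuid}


# A documentation generator reads the same text that help() displays.
print(inspect.getdoc(get_sample))
```

Since the documentation lives next to the code it describes, a change to a method's signature and the matching docstring update can land in the same commit.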

5. Bonus: String enums for better auto-complete

We have replaced string literals with enums in the latest version of the Python SDK. This simple change makes it possible for development tools like VS Code to autocomplete your code and provide suggestions, which comes in handy when you are unsure about the options available to you.
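
A string enum can be sketched as follows. The `DemoTaskType` class, its members, and the `describe` function are hypothetical and chosen for illustration; they are not the SDK's actual definitions.

```python
from enum import Enum


class DemoTaskType(str, Enum):
    """Hypothetical task types; member names and values are assumptions."""

    SEGMENTATION = "segmentation"
    BOUNDING_BOXES = "bounding-boxes"
    KEYPOINTS = "keypoints"


def describe(task_type: DemoTaskType) -> str:
    # Accepts enum members as well as raw strings; an unknown string
    # raises ValueError instead of being passed along silently.
    task_type = DemoTaskType(task_type)
    return f"Creating a dataset for {task_type.value} labeling"


print(describe(DemoTaskType.BOUNDING_BOXES))  # editors can suggest the members
print(describe("keypoints"))                  # plain strings keep working
print(DemoTaskType.KEYPOINTS == "keypoints")  # members compare equal to their value

try:
    describe("keypoint")  # typo
except ValueError as error:
    print(f"Rejected: {error}")
```

Mixing in `str` keeps the enum compatible with code that still passes plain strings, while typos are caught at the call site.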

Wrapping up

We have just explored five features we added to our Python SDK to make it more user-friendly: its RESTful design, enhanced data validation, improved error handling, automatic documentation, and string enums for autocomplete suggestions. These all contribute to a smooth and intuitive developer experience.

If you have any questions about the topics covered in this blog post or need further clarification, don’t hesitate to reach out to us at arnaud@segments.ai.