Announcing Featureform 0.9!

June 5, 2023

Hi folks! We're excited to announce v0.9 of Featureform, one of our biggest releases to date! It includes:

  • Redis Vector Store & Embeddings Support: Featureform 0.9 introduces the ability to use Redis as a Vector Store, alongside the capability to generate embeddings via data pipelines using our Python API.
  • API & Data Interaction Improvements: Our API now supports interacting with training data sets as dataframes, in addition to interacting with sources as dataframes locally.
  • Expanded Compatibility and Increased Stability: This version brings support for Pandas on S3 and Kubernetes, opening up more opportunities for integration. In addition, we've significantly increased the stability of Featureform's scheduler, ensuring a seamless and consistent experience for our users.

We also want to give a big shout out to our community members and our customers for their ongoing support and feedback. We look forward to hearing your thoughts on v0.9. Feel free to drop us a line in our Slack community.

Vector Database and Embedding Support

You can use Featureform to define and orchestrate data pipelines that generate embeddings. Featureform can write them into Redis for nearest neighbor lookup. This also allows users to version, re-use, and manage embeddings declaratively. Learn more about Redis' Vector Store and Vector Similarity Search here.

Registering Redis for use as a Vector Store

You can register Redis as a vector store using the same method as you would for a non-Vector Store provider.

An example of Redis as a Vector Store Provider

A Pipeline to Generate Embeddings from Text

Featureform now allows you to generate embeddings from text using OpenAI's Embeddings. Simply add your

Defining and Versioning an Embedding

You can store your embedding definitions and version them with a Featureform variant as you iterate. Use the "embed_docs" tuple to specify the entity ID and the column name in index 0 and 1, respectively, and the "variant" parameter to specify a version.

This example creates "sentence_embedding"

Performing a Nearest Neighbor Lookup
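
Conceptually, a nearest neighbor lookup ranks stored embeddings by their similarity to a query vector and returns the closest entities. The sketch below illustrates the idea in plain Python with a brute-force cosine-similarity search. It is not Featureform's or Redis' API: the `nearest` helper, the document IDs, and the toy vectors are all made up for illustration, and a real vector store would use an index rather than scanning every entry.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest(embeddings, query, k):
    """Return the IDs of the k stored vectors most similar to `query`.

    `embeddings` maps an entity ID to its vector (hypothetical layout).
    """
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(item[1], query),
        reverse=True,
    )
    return [entity_id for entity_id, _ in ranked[:k]]


# Toy three-dimensional "embeddings"; real ones have hundreds of dimensions.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

closest = nearest(embeddings, [1.0, 0.0, 0.0], k=2)
print(closest)
```

Here `doc_a` and `doc_b` point in nearly the same direction as the query, so they rank ahead of `doc_c`.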

Interact with Training Sets as Dataframes

v0.8 Recap:

In v0.8, we added the ability to interact with sources as dataframes. Prior to v0.8, users couldn't experiment with data they had previously registered with a provider. Instead of registering transformations and working with them during experimentation, they had to wait until they were production ready to register them with Featureform. Our new experimental API removes this workflow disruption and allows data scientists to serve data directly from sources registered on providers. This eliminates the need for separate connections or additional database client libraries, streamlining the data experimentation process and allowing users to register transformations more efficiently.

We're excited to extend the same functionality to training sets as well! In this example, you can specify the name of the training set (e.g., "fraud") and its variant (e.g., "simple").

Enhanced Scheduling across Offline Stores

Featureform supports cron syntax for scheduling transformations to run. This release revamps this functionality to make it more stable and efficient, and also adds more verbose error messages.
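
For readers less familiar with cron syntax, here is a rough sketch of how a schedule string maps to run times. It assumes a standard five-field expression (minute, hour, day of month, month, day of week) and handles only `*`, single numbers, ranges, and comma lists. The `cron_matches` helper is hypothetical and is not part of Featureform; it also simply requires every field to match, which is a simplification of real cron behavior.

```python
from datetime import datetime


def field_matches(field, value):
    """Check one cron field against a value.

    Supports '*', single numbers, 'a-b' ranges, and comma-separated lists.
    Step values such as '*/5' are intentionally left out of this sketch.
    """
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if `moment` satisfies a five-field cron expression."""
    minute, hour, day, month, weekday = expression.split()
    # Cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


HOURLY = "0 * * * *"  # minute 0 of every hour

print(cron_matches(HOURLY, datetime(2023, 6, 5, 14, 0)))
print(cron_matches(HOURLY, datetime(2023, 6, 5, 14, 30)))
```

The first check falls exactly on the hour and matches; the second is thirty minutes past and does not.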

A transformation that runs every hour on Snowflake

Run Pandas Transformations on K8s with S3 

Featureform schedules and runs your transformations for you. We now support running Pandas directly: Featureform spins up a Kubernetes job to run each transformation. This isn’t a replacement for distributed processing frameworks like Spark (which we also support), but it’s a great option for teams that are already using Pandas for production.

Defining our Pandas on Kubernetes Provider

Registering a file in S3 and a Transformation on it

_________________________________________________________________________________________________________________________

Interested in learning more about v0.9 of Featureform or looking for access control and governance capabilities? Book a demo of the Featureform platform here!
