Easily manage your machine learning features across your organization.

Forget costly rewrites of your database or limitations on the machine learning libraries you can use. As a virtual feature store, we integrate with your current infrastructure, giving you ultimate flexibility. Defining, sharing, training, and serving features has never been simpler.

Our workflow in action

Don’t rebuild what you’ve already created. Featureform sits atop your existing architecture, helping you turn it into a powerful feature store, accessible to anyone in your organization.

Build better features from both batch and streaming data

Your raw data is transformed into high quality features — the key to effectively trained models.

Effortless sharing and discovery across your organization

Our virtual feature store platform lets you share features as first-class entities of the ML process.

Serve and deploy your production-ready features

Deploy the features you’ve defined quickly and confidently.

import featureform as ff

ff.register_kubernetes_executor(
  name = "demo-kube",
  host = "https://demo-kube-env.alias",
  client_certificate     = "~/.kube/client-cert.pem",
  client_key             = "~/.kube/client-key.pem",
  cluster_ca_certificate = "~/.kube/cluster-ca-cert.pem",
)

ff.register_spark_executor(
  name = "demo-spark",
  host = "https://demo-spark-env.alias",
)

import featureform as ff

@ff.materialize(
    name = "Non-free Sulfur Dioxide",
    description =  "Sulfur Dioxide that is trapped",
    version = 1,
    inputs = ["Wine Data:total_sulfur_dioxide", "Wine Data:free_sulfur_dioxide"],
    entity = "wine_id",
    executor = "demo-kube",
    exec_type = "python",
)
def nonfree_sulfur(total, free):
    return total - free

ff.upload_file("./wine-quality.csv", split_columns=True)
ff.register_feature(nonfree_sulfur, entity="wine_id", version=1)
ff.register_feature(name="fixed_acidity", source="wine_data:fixed_acidity", entity="wine_id")
ff.register_label(name="quality", source="wine_data", field="quality", entity_values="_id")
ff.register_training_set(
    name="wine_training_set",
    label="quality",
    features="wine_features",
    sampling=ff.All(),
)

import featureform as ff

import joblib
from sklearn.ensemble import RandomForestClassifier

# Get the training set we created
train_set = ff.training_set("wine_training_set")

# Get the labels and features (load them in memory)
features = train_set.features().to_mat()
labels = train_set.labels().to_arr()

# Train a classifier
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(features, labels)

# Write it to a file
model_file = "model.sklearn"
joblib.dump(clf, model_file)

import featureform as ff

import joblib

# Load the model from a file
model_file = "model.sklearn"
clf = joblib.load(model_file)

# Load our features
features = ff.online_feature_set("wine_features")

# Predicting quality of wine_id 0
print(clf.predict(features["0"]))
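
To make the nonfree_sulfur transformation above concrete, here is a minimal sketch in plain Python of what it computes for each wine. It does not use the Featureform API. The sample rows and the derive_features helper are hypothetical and exist only for illustration; only the column names (wine_id, total_sulfur_dioxide, free_sulfur_dioxide) come from the snippets above.

```python
import csv
import io

# Hypothetical sample rows; the values are made up for illustration.
SAMPLE_CSV = """wine_id,total_sulfur_dioxide,free_sulfur_dioxide
0,34.0,11.0
1,67.0,25.0
2,54.0,15.0
"""


def nonfree_sulfur(total, free):
    """Sulfur dioxide that is trapped: total minus free."""
    return total - free


def derive_features(csv_text):
    """Illustrative helper: map each wine_id to its non-free sulfur dioxide."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["wine_id"]: nonfree_sulfur(
            float(row["total_sulfur_dioxide"]),
            float(row["free_sulfur_dioxide"]),
        )
        for row in reader
    }


features = derive_features(SAMPLE_CSV)
print(features["0"])  # 23.0
```

Keying the result by wine_id mirrors the entity lookup in the serving snippet, where features["0"] refers to wine 0.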

Configuration to Production

Facilitating every step.

Featureform optimizes the entire feature development process, offering powerful functionality through our Declarative API and Feature Catalog.

check out our documentation

Diving into the subject

From overviews to niche applications and everything in between, explore current discussion and commentary on feature management.

explore our resources

An integrated approach

One of Featureform’s clearest advantages is compatibility with a wide range of data infrastructure and ML platforms. Whether you use Amazon SageMaker, Databricks, or Kubeflow, we’re ready for you from day one.

Ready to get started?

See what a virtual feature store means for your organization.