Handling Missing Data Using Multiresolution Tensor Completion

Learn how the Data Science and Machine Learning team at Amplitude is leading the way on new industry solutions for handling missing data.
Aug 9, 2021

7 min read

Cao (Danica) Xiao

Former Senior Director, Data Science & Machine Learning, Amplitude

Longitudinal user behavioral data track users’ interactions with digital products or information systems at different points in time. Such data are ubiquitous across a wide range of digital businesses. However, when it comes to putting that data to use, problems with missing data are just as common.

For reasons such as poor data onboarding and unreliable data sources, many businesses lose data. Missing data makes it significantly harder to provide accurate insights. Moreover, the gaps often follow complex patterns, which makes them even more difficult to handle.

At Amplitude, our digital optimization system helps companies track longitudinal user activity on digital products, which is used to generate insights about product optimization, user engagement, churn prevention, and more. To ensure trustworthy and accurate results for all those tasks, it has been a long-standing mission of our machine learning team to effectively handle missing data patterns.

In an upcoming paper to be published at the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’21), we collaborated with researchers from the University of Illinois at Urbana-Champaign (UIUC) to propose a new machine learning method for this long-standing problem of missing data in data analysis. In particular, we propose a multiresolution tensor completion method for handling missing data patterns in our event-based user behavioral data.

Tensor completion is a classical data imputation technique for multi-dimensional data. For example, given a 3D tensor of user-product-time, the tensor element (x, y, z) is a binary value indicating whether user x bought product y in month z. Often, a large portion of the input tensor is missing, and tensor completion aims to estimate those missing elements. In our proposed multiresolution tensor completion method (abbreviated to “MTC”), we tackle two missing data patterns to achieve more accurate tensor completion:

  • Partial observation: Only a small subset of data elements exists in the input data tensor. For example, we only observe a small percentage of user-product relevance scores based on limited historical user transactions, while the vast majority of the user-product scores are unknown.
  • Coarse observation: Some tensor dimensions only have coarse and aggregated patterns (e.g., monthly summary instead of daily reports).
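To make the two patterns concrete, here is a minimal NumPy sketch on a toy user-product-time tensor. The tensor sizes, the 30% observation rate, and the block size of 3 are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: 4 users x 3 products x 6 time steps (binary purchases).
full = rng.integers(0, 2, size=(4, 3, 6)).astype(float)

# Partial observation: only a random subset of entries is seen.
mask = rng.random(full.shape) < 0.3        # True where an entry is observed
partial = np.where(mask, full, np.nan)     # NaN marks a missing entry

# Coarse observation: the time axis is only available in aggregated form,
# here as sums over consecutive blocks of 3 fine-grained time steps.
block = 3
coarse = full.reshape(4, 3, full.shape[2] // block, block).sum(axis=3)
```

In this toy setup, tensor completion would try to recover `full` given only `partial`, `mask`, and `coarse`.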

The specific test bed used in the paper is healthcare product analytics: spatio-temporal disease and healthcare demand prediction using historical observations of disease counts at specific locations and time points. In particular, a fine-granular observation tensor is constructed as a 3D tensor of disease code by zip code by date, while two aggregated 2D tensors (i.e., two matrices) are also present: (1) disease categories (coarse-level diseases) by county, and (2) disease categories by week. The goal is to accurately estimate all the entries in the fine-granular 3D tensor.
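One way to picture how the two aggregated matrices relate to the fine-granular tensor is through 0/1 membership matrices. The sketch below assumes the aggregates are plain sums over the collapsed dimension; the group assignments, sizes, and names such as `one_hot_membership` are hypothetical and only for illustration.

```python
import numpy as np

def one_hot_membership(groups, n_groups):
    """Return an (n_groups x n_items) 0/1 matrix mapping each item to its group."""
    m = np.zeros((n_groups, len(groups)))
    m[groups, np.arange(len(groups))] = 1.0
    return m

rng = np.random.default_rng(1)
n_codes, n_zips, n_days = 6, 4, 14

# Hypothetical fine-granular tensor: disease code x zip code x date.
counts = rng.poisson(2.0, size=(n_codes, n_zips, n_days)).astype(float)

# Hypothetical hierarchies: code -> category, zip -> county, day -> week.
code_to_category = np.array([0, 0, 1, 1, 2, 2])
zip_to_county = np.array([0, 0, 1, 1])
day_to_week = np.arange(n_days) // 7

P = one_hot_membership(code_to_category, 3)
Q = one_hot_membership(zip_to_county, 2)
R = one_hot_membership(day_to_week, 2)

# (1) disease category by county: sum out dates, then aggregate codes and zips.
category_by_county = P @ counts.sum(axis=2) @ Q.T
# (2) disease category by week: sum out zips, then aggregate codes and days.
category_by_week = P @ counts.sum(axis=1) @ R.T
```

In practice only the aggregates and a sparse subset of `counts` would be available; the full `counts` tensor is what the completion method has to estimate.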

Our MTC Method

The proposed method, Multiresolution Tensor Completion (MTC), follows a multiresolution recursive algorithmic flow.

To handle missing data patterns with the MTC method, we first subsample all accessible information (i.e., the tensors and the known aggregation matrices) down to the lowest resolution. Different subsampling strategies are used depending on the data type. For continuous dimensions such as time, regular sampling is used, while for categorical dimensions biased sampling is applied to focus on the feature dimensions with large values or denser observations. For example, in the spatio-temporal disease tensor presented in the paper, time is sampled at regular intervals (e.g., every t observations), and disease dimensions are sampled to keep common diseases.
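The two subsampling strategies can be sketched as follows. This is only an illustration: ranking categorical indices by their number of observed entries is one plausible reading of "denser observations", and the function names and toy mask are assumptions rather than the paper's implementation.

```python
import numpy as np

def regular_sample(n, step):
    """Indices 0, step, 2*step, ... for a continuous axis such as time."""
    return np.arange(0, n, step)

def density_biased_sample(mask, axis, keep):
    """Keep the `keep` indices along `axis` that have the most observed entries."""
    other_axes = tuple(a for a in range(mask.ndim) if a != axis)
    density = mask.sum(axis=other_axes)
    top = np.argsort(-density, kind="stable")[:keep]
    return np.sort(top)

# Toy observation mask: 5 diseases x 3 locations x 8 time steps.
mask = np.zeros((5, 3, 8), dtype=bool)
mask[1] = True            # disease 1 is fully observed
mask[3, :, :4] = True     # disease 3 is observed half of the time
mask[0, 0, 0] = True      # disease 0 has a single observation

tensor = np.arange(5 * 3 * 8, dtype=float).reshape(5, 3, 8)

disease_idx = density_biased_sample(mask, axis=0, keep=2)   # common diseases
time_idx = regular_sample(tensor.shape[2], step=2)          # every 2nd time step

# Low-resolution tensor and mask restricted to the sampled indices.
sel = np.ix_(disease_idx, np.arange(tensor.shape[1]), time_idx)
low_tensor, low_mask = tensor[sel], mask[sel]
```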

Next, we solve the low-resolution problem by applying a tensor optimization solver. In the paper, we propose a constrained alternating least squares (ALS) approach to efficiently solve the optimization problem.
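For readers unfamiliar with alternating least squares, the sketch below shows a generic masked ALS for a rank-R CP factorization: each factor is updated in turn by solving a small ridge-regularized least-squares problem per row, using only the observed entries. It is a textbook-style illustration, not the solver from the paper; it omits the coupling with the aggregated matrices and any constraints, and the `ridge` term, rank, and iteration count are arbitrary assumptions.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3D tensor from CP factors: X[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def update_factor(X, mask, B, C, ridge):
    """Update the first-mode factor with the other two factors held fixed."""
    rank = B.shape[1]
    A = np.zeros((X.shape[0], rank))
    for i in range(X.shape[0]):
        j, k = np.nonzero(mask[i])          # observed entries in slice i
        design = B[j] * C[k]                # one feature row per observed entry
        gram = design.T @ design + ridge * np.eye(rank)
        A[i] = np.linalg.solve(gram, design.T @ X[i, j, k])
    return A

def masked_cp_als(X, mask, rank, n_iter=50, ridge=1e-3, seed=0, init=None):
    """Fit CP factors to the observed entries of X; `init` optionally warm-starts them."""
    rng = np.random.default_rng(seed)
    if init is None:
        init = [rng.standard_normal((n, rank)) for n in X.shape]
    A, B, C = init
    for _ in range(n_iter):
        # Transposes put the mode being updated first, so one routine serves all modes.
        A = update_factor(X, mask, B, C, ridge)
        B = update_factor(X.transpose(1, 0, 2), mask.transpose(1, 0, 2), A, C, ridge)
        C = update_factor(X.transpose(2, 0, 1), mask.transpose(2, 0, 1), A, B, ridge)
    return A, B, C

# Toy problem: a random rank-2 tensor with roughly 60% of entries observed.
rng = np.random.default_rng(42)
true_factors = [rng.standard_normal((n, 2)) for n in (8, 7, 6)]
X = cp_reconstruct(*true_factors)
mask = rng.random(X.shape) < 0.6

A, B, C = masked_cp_als(X, mask, rank=2)
X_hat = cp_reconstruct(A, B, C)
obs_err = np.linalg.norm((X - X_hat)[mask]) / np.linalg.norm(X[mask])
```

The optional `init` argument is where a warm start from a lower resolution would plug in, instead of the random initialization used here.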

Finally, we interpolate the solution into the higher resolution to initialize the high-resolution factors. We repeat this process, resolution by resolution, until we have a good initialization for the original fine-granular problem.
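For a continuous axis such as time, the interpolation step can be sketched with simple linear interpolation of each factor column. This is an assumed illustration: the helper name `upsample_factor` is hypothetical, and this post does not spell out the exact interpolation scheme or how categorical axes are filled in.

```python
import numpy as np

def upsample_factor(factor_low, low_idx, n_high):
    """Linearly interpolate each column of a low-resolution factor onto 0..n_high-1.

    `low_idx` holds the high-resolution positions that the low-resolution rows
    were sampled from. Positions past the last sampled index repeat its value.
    """
    high_idx = np.arange(n_high)
    cols = [
        np.interp(high_idx, low_idx, factor_low[:, r])
        for r in range(factor_low.shape[1])
    ]
    return np.stack(cols, axis=1)

# Low-resolution time factor (4 sampled time steps, rank 2) taken at steps 0, 2, 4, 6.
low_idx = np.array([0, 2, 4, 6])
factor_low = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]])

# Initial guess for the time factor at the full resolution of 8 steps.
factor_high = upsample_factor(factor_low, low_idx, n_high=8)
```

The result would serve only as a starting point; the solver at the higher resolution is expected to refine it.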

Results

In our KDD paper, MTC was evaluated on real-world spatio-temporal demand prediction scenarios in a healthcare industry setting. The experiments predict future COVID cases for each location in the United States by mining the longitudinal public health data generated during interactions between patients and healthcare systems.

We evaluated our MTC algorithm against leading tensor completion baselines, including Block Gradient Descent (BGD), B-PREMA, and CMTF-OPT, on accuracy and efficiency metrics such as Percent of Fit (PoF), CPU time, and peak memory usage. MTC outperforms all baselines by a large margin on PoF and CPU time while having about the same low space complexity, which shows great promise for its deployment in production.
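As a rough guide to reading the accuracy metric, the sketch below computes Percent of Fit under the assumption that it follows the commonly used definition, one minus the relative Frobenius-norm error between the true and reconstructed tensors. The exact definition used in the paper should be checked there.

```python
import numpy as np

def percent_of_fit(X_true, X_hat):
    """Assumed definition: PoF = 1 - ||X_true - X_hat||_F / ||X_true||_F."""
    return 1.0 - np.linalg.norm(X_true - X_hat) / np.linalg.norm(X_true)

X_true = np.arange(1.0, 25.0).reshape(2, 3, 4)
perfect = percent_of_fit(X_true, X_true)                 # exact reconstruction
trivial = percent_of_fit(X_true, np.zeros_like(X_true))  # all-zero prediction
```

Under this definition, higher is better: an exact reconstruction scores 1 and an all-zero prediction scores 0.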

Leading the Way on Handling Missing Data

At Amplitude, we strive to help all our customers obtain user and product insights that are trustworthy and consistent. The proposed MTC approach is one of our efforts toward that mission, as it can readily translate into a general solution for large user-product interaction data from all industries. Generally speaking, we can create an input tensor of user-product-time, along with aggregate tensors of user-product and user-time, and apply the proposed method to accurately estimate every element in the user-product-time tensor. The downstream applications of such a task are directly related to our key products (Amplitude Analytics, Amplitude Recommend, and Amplitude Experiment), which help forecast future user behaviors, recommend what content to show users at any given moment in their journey, or impute user data to reduce bias in experimentation.

This work is just one of the many ways our team is leading the way on data analysis in digital business. Interested in getting involved? Check out our careers page to learn more.


This work is a collaboration with Professors Sun and Solomonik at the University of Illinois Urbana-Champaign and with industry collaborators at IQVIA. A preprint version of the full paper can be found at this link; the paper will be presented at the KDD conference, held Aug 14-18, 2021.

About the author
Cao (Danica) Xiao


Cao (Danica) Xiao is a former senior director and head of data science and machine learning at Amplitude. A passionate machine learning researcher, she led Amplitude's machine learning team in building machine learning solutions for Amplitude products.
