
Neptune.ai full revamp

Company:

Neptune.ai

Year:

2024-2025

Overview

Redesigning Neptune.ai in six months to make complex AI workflows clear, usable, and fast — helping the team win over OpenAI and elevate the product experience.

Context & My role

I joined Neptune.ai as a Senior Product Designer at a pivotal moment. The company was six months away from finalizing a major one-year contract with OpenAI. To secure the deal, Neptune had to adapt the product to OpenAI’s highly specific workflows, and fast.

The product’s backend performance was exceptional, but the UX and UI significantly lagged behind competitors. Teams at OpenAI were tracking experiments in Neptune, but switching to Weights & Biases whenever they needed to share or present results. This was a major risk.

The design team had just three people; within weeks, it became two: our Head of Design and me.


Together, we were responsible for:

  • modernizing the core UX and UI

  • rebuilding the design system

  • validating solutions directly with OpenAI and other key customers

  • supporting engineers under an extremely compressed timeline

The challenge

The product experience had three critical issues:

  1. High complexity, low clarity: the interface wasn’t designed for data-dense AI workflows.

  2. Fragmentation: inconsistent UI patterns made experimentation and reporting slow.

  3. Missing trust-building UX: OpenAI needed reliability, readability, and control, not visual chaos.


On top of that:

  • Most feature requests came from engineers as technical instructions, not problems to solve.

  • I was new to the ML research domain, so I had to learn fast while delivering improvements.

  • We had to meet the needs of very different personas: ML researchers, ML engineers, DS managers, and business stakeholders, each with different workflows and expectations.

Approach & Process

Understanding the users

I led a rapid product audit and, together with the Head of Design, defined four core personas and mapped their key flows:

  • ML Researchers (experiment comparison, metric analysis)

  • ML Engineers (system metrics, reproducibility)

  • Data Science Managers (monitoring progress, sharing results)

  • Business Stakeholders (reports, KPIs, clarity)

This allowed us to identify the high-traffic, high-impact areas to prioritize:

  • All Runs

  • Charts

  • Run Details

  • Side-by-side comparison

  • Dashboards & Reports

Establishing design leadership

We shifted from “design as UI output” to:

  • identifying real problems

  • crafting solutions backed by user feedback

  • designing scalable systems

  • validating directly with OpenAI and other strategic customers


Design system foundation

I co-built Neptune’s internal design system, which:

  • standardized interactions across dashboards, charts, and reports

  • enabled engineers to ship consistently

  • cut delivery time for new components and pages

What I worked on

Below are highlights of the most impactful work and the product thinking behind it.

Core product modernizations

Over six months, I redesigned:

  • Charts (layout, global/local settings, pinned legend)

  • Dashboards (new components, empty states, export to image)

  • Reports (templates, navigation, modes, versioning, auto sections)

  • Single Run Page (metadata hierarchy, artifacts organization)

  • Side-by-Side comparison (clarity on differences, improved usability)


What I Delivered

  • A modernized UI across the entire experiment tracking experience

  • A new design system with documented components

  • Redesigned Reports, Dashboards, Charts, and Run Details

  • UX improvements validated with OpenAI key users

  • Scalable patterns for long-text handling and data grouping

  • Faster team workflows through reusable templates

Example deep dive: Long metric names

Problem:

Users struggled to read, compare, and scan very long metric names (e.g., train/v1/phase/model_loss/step_early).

The structure was highly repetitive, differences were subtle, and readability broke down in charts, dropdowns, and reports.

Constraints:

  • I had no access to real metric names due to privacy policies.

  • Naming conventions varied widely between customers and even across teams.

  • Solutions needed to work for OpenAI, Poolside, and future enterprise clients.

  • Some customers preferred UI configuration; others preferred code-based control.


Approach: Designing a Flexible Solution for Unpredictable Data

I explored several directions, all focused on making metrics easier to read at scale.

Concept exploration: Metric manager (NOT built)

I explored a project-level management tool enabling users to:

  • group similar metrics

  • rename metrics in bulk (e.g., via regex patterns)

  • define how names should be shortened (start/middle/end)

  • hide noise in naming patterns


Why it wasn’t built:

  • OpenAI users found project-level renaming too rigid for their workflow

  • They preferred adjustments closer to the reporting layer

  • Potential UI for renaming risked being slower than editing in code

  • Maintaining consistency across large teams would be difficult


Value:

Even though this solution wasn’t implemented, it clarified what different personas needed and informed the eventual direction.

Practical, shippable solution: Smart aggregation logic (built)

Some customers had metrics that were static values, making aggregation options (AVG, VARIANCE, etc.) irrelevant.

I designed and shipped a lightweight rules-based system (sketched below):

  • If the metric has one value → hide aggregation options

  • If it has multiple → surface the full list


Impact:

  • Reduced dropdown noise

  • Improved clarity in metric selection

  • Quick win requested by Poolside, validating responsiveness to user feedback

Dynamic section generation for reports (explored → partially used in other work)

Large reports became unreadable when they contained dozens of metrics.

I explored an automated system (sketched below) that:

  • detects shared prefixes

  • groups related metrics

  • collapses sections like train/v1, train/v2, eval

  • persists collapsed states across sessions or shared links
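A minimal sketch of the grouping step, assuming a simple convention where a metric's section is everything before its last "/"; the function name groupByPrefix and the "(ungrouped)" fallback are illustrative, not product code:

```typescript
// Illustrative sketch: bucket metric names into collapsible report sections by shared prefix.
function groupByPrefix(metricNames: string[]): Map<string, string[]> {
  const sections = new Map<string, string[]>();
  for (const name of metricNames) {
    const idx = name.lastIndexOf("/");
    const section = idx === -1 ? "(ungrouped)" : name.slice(0, idx);
    const group = sections.get(section) ?? [];
    group.push(name);
    sections.set(section, group);
  }
  return sections;
}

// groupByPrefix(["train/v1/loss", "train/v1/acc", "eval/loss"])
// -> Map { "train/v1" => ["train/v1/loss", "train/v1/acc"], "eval" => ["eval/loss"] }
```

Persisting collapsed or expanded state per section, across sessions or shared links, would sit on top of a grouping like this.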


Feedback:

  • Poolside: “Extremely useful” for organizing complex experiments

  • OpenAI: This didn’t address their primary pain points, but the idea influenced broader grouping logic in reports

Although this wasn’t shipped as a standalone feature, parts of the solution informed improvements in the overall reporting experience.

The implemented direction: Highlighting differences in metric names (built + launched)

The final shipped solution focused on the simplest, most universal problem to solve:

Users needed to instantly see what’s different between two long names.


What I designed (see the sketch after this list):

  • Automatic detection of common prefixes

  • Collapsing identical sections with ellipses

  • Highlighting only the differing parts

  • A consistent, minimal visual pattern that also works in charts, dropdowns, tooltips, tables, and report widgets
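A minimal sketch of the collapsing logic, assuming path-like names split on "/"; collapseCommonPrefix is an illustrative name, not the shipped implementation:

```typescript
// Illustrative sketch: collapse the shared prefix of path-like names so only the differences stand out.
function collapseCommonPrefix(names: string[]): string[] {
  if (names.length < 2) return [...names]; // nothing to compare against
  const split = names.map((n) => n.split("/"));
  // Count how many leading segments all names share, always leaving at least one segment visible.
  let shared = 0;
  while (split.every((s) => s.length > shared + 1 && s[shared] === split[0][shared])) {
    shared++;
  }
  // Replace the shared prefix with an ellipsis; the remaining tail is what gets highlighted in the UI.
  return split.map((s) => (shared > 0 ? ["…", ...s.slice(shared)] : s).join("/"));
}

// collapseCommonPrefix(["train/v1/phase/model_loss", "train/v1/phase/model_acc"])
// -> ["…/model_loss", "…/model_acc"]  ("train/v1/phase" collapsed, differing tails kept)
```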


Validation with strategic users:

  • OpenAI: Very positive — made comparison dramatically faster

  • Poolside: “Finally readable”

  • Internal teams adopted this pattern for long experiment and run names too

Impact

Outcome:

  • Clearer, denser, more readable charts

  • Reduced visual clutter across multiple product areas

  • No regressions in export, responsive layouts, or linked states


Impact Summary

  • Improved scanning and comparison speed in core workflows

  • Reduced cognitive load for researchers analyzing dozens of metrics

  • A reusable pattern for long-text UI across the product

  • Positive sentiment from OpenAI and other enterprise customers

This project showed that even small usability improvements can meaningfully improve productivity in highly technical, high-cognitive-load environments.

Overall impact & Outcomes

Business Impact

  • OpenAI signed a one-year contract with Neptune

  • Their largest teams fully migrated to Neptune

  • For the first time, users stayed in Neptune for reports, instead of switching to Weights & Biases

  • Product quality improved to the point where Neptune was positioned ahead of competitors in UX/UI


User Impact

  • Major reduction in visual clutter

  • Faster metric comparison and interpretation

  • Clearer reporting and collaboration workflows

  • More consistent and reliable UI across the product


Team & Process Impact

  • Engineering adopted design system components for consistent execution

  • Design gained influence and clearer problem ownership

  • Faster design-to-development velocity


Reflections

This project taught me that:

  • Good design can win enterprise-level deals — even in highly technical products.

  • Systems thinking scales better than one-off features.

  • Talking to real users — even a handful — dramatically increases clarity and confidence.

  • Design leadership is often about framing the problem correctly in engineering-driven environments.

The biggest win: shifting Neptune from “we just need someone to make screens” to design as a strategic partner that shaped the product’s future and helped land its biggest customer.


Design stories

Redesigning Neptune.ai in six months to make complex AI workflows clear, usable, and fast - helping the team win over OpenAI and elevate the product experience.

Flexi plan & Client onboarding

Redesigning adoption and onboarding at Capitalise.com: Flexi Plan & Client Onboarding simplified financial tools for small accounting firms and empowered SMEs with actionable credit insights.

Marketing UI kit

This project was about making life easier for designers and keeping the brand consistent. Instead of everyone spending time recreating buttons, icons, or layouts from scratch, we built a central library of reusable elements and clear guidelines.


Let’s Connect
