
EPIC AI products #30297

Open
jamesefhawkins opened this issue Mar 21, 2025 · 19 comments
Labels
concept (Ideas that need some shaping up still), epic

Comments

@jamesefhawkins
Collaborator

jamesefhawkins commented Mar 21, 2025

Context / Goal

AI editors now exist that do everything inside your code.
PostHog does everything outside the code.
Bring them together, and you have products that make themselves successful.

Behold - the PostHog Editor.

Proposal

I propose we build everything into a VSCode extension.

I think there are 4 products within this:

  • Max AI chat, but in the extension - ask anything / do anything with PostHog
    • we need to build manifest.tsx to make clear what each product can export to Max, so that the individual teams can make their products work with Max, rather than this team porting everything to Max one at a time (a rough manifest sketch follows this list)
  • Features + automatic telemetry
    1. interpret the code and list all the major features
    2. automatically instrument every feature with events, session replay and errors (?)
    3. create relevant insights automatically
  • Suggestions of what to build
    • use data from errors / analytics / replays / github issues / zendesk tickets / feedback products / surveys / CRM data
    • "users are getting stuck using X feature, we suggest changing it like Y"
    • "your activation rate is low, we suggest changing the activation flow like this"
  • Write code, but ship it properly; this does 3 things:
    1. Prompt -> code; there are lots of open-source bits of software we could look at for doing this
    2. "How would you like to ship this feature?" It should ask the questions a PM would ask, then automatically create either a flag, a self-serve beta, an experiment, or nothing. Once messaging exists, it could even set up emails to get feedback and eventually launch the feature publicly.
    3. instrumentation should get updated automatically
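
To make the manifest idea concrete, a rough sketch of what each product team could export (the names and shape here are assumptions, not an agreed schema):

```typescript
// Hypothetical manifest a product team would export so Max (and the extension)
// can discover what that product can do - not an existing PostHog schema.
export interface ProductManifest {
  product: string;                        // e.g. "feature_flags", "session_replay"
  description: string;                    // one-liner Max can surface to users
  tools: {
    name: string;                         // e.g. "create_feature_flag"
    description: string;                  // what the tool does, written for the LLM
    inputSchema: Record<string, unknown>; // JSON Schema for the tool's arguments
    run: (args: Record<string, unknown>) => Promise<unknown>;
  }[];
}
```

Each team would own its own manifest, and Max would only ever see the union of them.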

Longer term

We fork VS Code and create a desktop app.

PMs use it to see suggestions of what to work on and to dig deeply into product data. Maybe even designers use it to tweak the designs
Engineers use it to code the things the team should work on
Salespeople use it to figure out how customers are using products/trials and so on, and to create deals/operate it as a CRM
Marketing people use it to configure messaging campaigns, do conversion rate optimization, ask deep data questions, and configure ad campaigns
Support people use it to handle tickets, with AI powering the answers

Whereas teams used to have 10 developers, 10 salespeople, 10 marketing people, etc, they will just have 1 of each at the same revenue. This person will "configure" how the function operates, then AI will do all of it.

How do we launch this

Pretty much any of these distinct "products" in the extension adds enough value that, I'd argue, we can launch with just one of them.

We have a huge launch happening in early May. That is when we aim to promote the PostHog Editor.

Technical questions

Max displays data. How painful is it to do this in an extension? I think if we push editing via Max / chat, we can avoid having to create e.g. interactive filters, with the escape path being: click it, then view the insight on the web to edit it.

How do we preview suggested new features to a user? Is text enough? Does any IDE do a good job of this?

Notably missing

After chatting with Hamed, I don't think MCPs make a lot of sense here, which feels odd given the industry seems to want to go that way. The extension just means we can do way more, as far as I can tell.

What it would look like

Product features aren't listing well yet, but you get the idea.

@jamesefhawkins
Collaborator Author

(Also, Tim and I are very, very happy to invest seriously in this area; we could add lots of engineers as needed, but obviously that can make things way slower.)

@sdnyco

sdnyco commented Mar 21, 2025

I am a big fan of the idea, and we should invest all resources. No matter the cost.

@sdnyco

sdnyco commented Mar 21, 2025

Also: I don't think I was invited to Paris.

@jamesefhawkins added the concept (Ideas that need some shaping up still) label Mar 21, 2025
@skoob13
Contributor

skoob13 commented Mar 21, 2025

Some random feedback.

New Modes

Current IDEs only have SWE modes: chat (user <-> LLM), implement (agent: plan and code), and fix (code patch). Let's bring a new mode: Product Manager. LLMs are good at planning and implementing, so we can build a multi-agent system where the PM agent orchestrates the work of SWE agents based on product-level data.
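
A very rough sketch of what that orchestration could look like (the agent interfaces below are assumptions, not an existing framework):

```typescript
// Hypothetical multi-agent loop: a PM agent ranks opportunities from product
// data, then hands the top one to an SWE agent to plan and implement.
interface Opportunity { summary: string; evidence: string[]; impactScore: number }

interface PMAgent {
  rankOpportunities(productData: unknown): Promise<Opportunity[]>;
}

interface SWEAgent {
  implement(task: Opportunity): Promise<{ diff: string; flagKey?: string }>;
}

export async function runProductManagerMode(pm: PMAgent, swe: SWEAgent, productData: unknown) {
  const [top] = await pm.rankOpportunities(productData); // highest-impact item first
  if (!top) return;
  // The diff would be shown to the user for review; the optional feature flag
  // gates the rollout so the impact can be measured afterwards.
  return swe.implement(top);
}
```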

Session Replay

Session replay is extremely useful for this product and, at the same time, the most complex for LLMs. Replay uses rrweb, so the payload is huge. This makes it challenging to convert to natural language, and no existing solution can transcribe replays at scale with an appropriate cost.

At the same time, users have real problems to solve with AI there:

  • How can I watch sessions at volume?
  • What should I watch?
  • How can I detect visual bugs?
  • What are the action items from this session?
  • (Sure, the session replay team has more insights here. That's what I've found so far.)

$0.005/session = 50k tokens on Mistral Small; it's impossible to feed in the entire session. I've been thinking about clustering sessions based on the features so we can later process samples: collected events, sessions dropped from key metrics (such as a funnel for the checkout flow), ML-based feature extraction from payloads (no ideas yet), some algorithmic extraction from the JSON (such as pre-processing to transcribe first to consume fewer tokens).

Based on those features (tags), we can partially solve the volume problem so similar sessions can be grouped. More advanced processing can be done to group sessions based on their transcripts and extract actionable feedback: using a multimodal LLM to find visual problems, detecting errors affecting key metrics the most, understanding user intent and behavior, and more.

This is something we must start doing ASAP, as session replay seems to me to be the "glue" for Max and for the Editor. Other products are in a better position to be LLM-ready.
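
To make the "pre-processing to transcribe first" idea concrete, a minimal sketch (the event/source codes mirror rrweb's enums; which events to keep is an assumption):

```typescript
// Collapse a raw rrweb payload into a short, token-cheap transcript of user
// interactions that an LLM (or a clustering step) could consume.
type RRWebEvent = { type: number; timestamp: number; data: Record<string, any> };

const INCREMENTAL_SNAPSHOT = 3;     // rrweb EventType.IncrementalSnapshot
const META = 4;                     // rrweb EventType.Meta
const SOURCE_MOUSE_INTERACTION = 2; // rrweb IncrementalSource.MouseInteraction
const SOURCE_INPUT = 5;             // rrweb IncrementalSource.Input

export function transcribeSession(events: RRWebEvent[]): string[] {
  const start = events[0]?.timestamp ?? 0;
  const lines: string[] = [];
  for (const e of events) {
    const t = Math.round((e.timestamp - start) / 1000); // seconds into the session
    if (e.type === META) {
      lines.push(`${t}s: navigated to ${e.data.href}`);
    } else if (e.type === INCREMENTAL_SNAPSHOT && e.data.source === SOURCE_MOUSE_INTERACTION) {
      lines.push(`${t}s: clicked/tapped node #${e.data.id}`);
    } else if (e.type === INCREMENTAL_SNAPSHOT && e.data.source === SOURCE_INPUT) {
      lines.push(`${t}s: typed into node #${e.data.id}`);
    }
    // Mutations, mouse moves and scrolls are dropped - they carry most of the
    // payload weight but little product signal for this use case.
  }
  return lines;
}
```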

Product/Company Memory

There is a potential for product/company memory:

  • External memory: Zendesk, Github, Google Docs, or any API-based system.
  • Collected memory: codebase, analytics, session replay, error tracking, surveys, CRM, etc.
  • Provided memory: user-provided info from conversations, explicitly saved, etc.

We could make this information consumable by the agents by collecting it into a memory bank, which would greatly help with decision-making, such as proposing features to build, retrieving user requests, etc.
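
A rough sketch of what a memory bank entry could look like (the names are assumptions, not an existing schema):

```typescript
// Hypothetical shared memory record an agent could read or retrieve over.
type MemorySource = 'external' | 'collected' | 'provided';

interface MemoryRecord {
  source: MemorySource;   // Zendesk/GitHub vs. analytics/replay vs. user-provided
  origin: string;         // e.g. "zendesk:ticket/123" or "insight:signup-funnel"
  summary: string;        // natural-language summary an agent can read directly
  embedding?: number[];   // optional vector for retrieval
  createdAt: string;      // ISO timestamp
}
```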

Suggestions of what to build

LLMs are really bad at decision-making. I would start by indexing customer requests and operating on top of the roadmap or key metrics. Those resources are more likely to contain actionable feedback, so LLMs won't struggle. Otherwise 👀

MVP

I think points 1 and 2 are realistic for the hackathon. We're ready to go with taxonomy query runners and the RAG pipeline for the event instrumentation.

Technical Questions

Max displays data. How painful is it to do this in an extension? I think if we push editing via Max / chat, we can avoid having to create e.g. interactive filters, with the escape path being: click it, then view the insight on the web to edit it.

We can render insights and filters into a Webview component (or actually any UI we have). The UI won't be perfect in some cases but should work.
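
For reference, a minimal sketch of that (the Webview API calls are real; the embeddable insight URL is an assumption):

```typescript
import * as vscode from 'vscode';

// Open a PostHog insight inside a VS Code Webview panel. Assumes we have a
// shareable/embeddable URL for the insight.
export function showInsight(insightUrl: string) {
  const panel = vscode.window.createWebviewPanel(
    'posthogInsight',        // internal view type
    'PostHog Insight',       // tab title
    vscode.ViewColumn.Beside,
    { enableScripts: true }  // the embedded chart needs JS
  );
  panel.webview.html = `<!DOCTYPE html>
    <html><body style="margin:0">
      <iframe src="${insightUrl}" style="width:100%;height:100vh;border:0"></iframe>
    </body></html>`;
}
```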

@HamedMP
Contributor

HamedMP commented Mar 21, 2025

love the combination of code + app/product/behavioral data, that's how future products will be built and ai agents can improve products autonomously based on user feedback and app usage.

to put the same idea as in the description visually, i see the ide extension, desktop app (and even mcp) as new interfaces we are introducing for posthog customers (first focusing on it as a new ui/ux, and then ai to boost it and make it 100x more useful)


currently posthog insights are available outside of where our core icp hangs out the most, i.e. their code editor, which means we need to:

  1. onboard them to the app
  2. guide them to integrate sdk
  3. go back to the app to see if everything is correct
  4. come back to the app to see the usage (hopefully they integrated it properly and tracked important things)
  5. make conclusions on what to do

the process 👆🏼 is time-consuming, error-prone and manual.

imagine a scenario where:

  1. init your next big idea (or open it in your favorite code editor)
  2. connect posthog extension to your account (or sign up to it directly from your editor)
  3. extension functionalities (some overlapping with the issue, just rephrasing it differently):
    1. integrate it to your favorite framework (any that our sdk supports)
    2. press a button to add events in core functions (with ai ✨)
    3. instrument feature flags, error tracking, llm observability,...
  4. see a list of your events and their metrics, both in the side panel and when hovering on the line where they are triggered (a rough hover sketch follows this list)
    1. similar to the gitlens extension
    2. or the jest test extension
  5. see the session recording for the path you are in
  6. (advanced): see recommendations of areas to improve: unused features, buggy pages, drop-offs in the funnel, ...
    1. this potentially should live in our (web) app as well; the extension is only a ui + an llm-equipped smart agent to help the customer.
  7. (super advanced): let "posthog ai" measure, deploy, analyze and iterate on features (in the well specified boundaries of files/features it can do).
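
a rough sketch of the "metrics on hover" part of point 4 (the VS Code hover API is real; the PostHog lookup is a hypothetical helper):

```typescript
import * as vscode from 'vscode';

// When the cursor rests on a line containing posthog.capture('event_name', ...),
// show that event's recent volume. fetchEventVolume is a hypothetical helper
// that would call the PostHog API.
const CAPTURE_RE = /posthog\.capture\(\s*['"]([^'"]+)['"]/;

export function registerEventHover(context: vscode.ExtensionContext) {
  context.subscriptions.push(
    vscode.languages.registerHoverProvider(['typescript', 'javascript'], {
      async provideHover(document, position) {
        const match = document.lineAt(position.line).text.match(CAPTURE_RE);
        if (!match) return undefined;
        const volume = await fetchEventVolume(match[1]); // hypothetical API call
        return new vscode.Hover(`**${match[1]}** - ${volume} events in the last 7 days`);
      },
    })
  );
}

declare function fetchEventVolume(event: string): Promise<number>;
```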

to summarize, i think it makes a lot of sense to move closer to where our icp spends 90% of their time, and with the extension or ide we can make it 99%, helping them to focus only on the big picture and core functionality while we help them to:

  1. see how their app is performing
  2. suggest how to improve it – this is where ai ✨ comes into place

p.s. in the future scenario, a potential "extension" to the platform could be the deployment infra: download the posthog ide, build the product, ship it, sell it (with our crm), measure it, improve it, all in one platform 🚀.

@HamedMP
Contributor

HamedMP commented Mar 21, 2025

Let's bring a new mode: Product Manager.

i love this @skoob13. do you want this to be ai-first from the get-go, or bring in the data (similar to our web app) for human-in-the-loop feedback, and automate more and more over time?

the main reason i'm thinking bringing in the data can be done first is that:

  • we need to have the "product manager" functionality in the web app (or somewhere) as well to be able to bring it to the ide. having access to the codebase doesn't add value for the pm; the decision is purely based on data and insights.

hence, we can focus on:

  1. what access to the source code enables us to do that was not possible before
  2. (in the meantime, continue super charging our web app, with tools like ai pm)
  3. use the web app features + source code access to do even cooler stuff, e.g. auto-commit, deploy, measure and repeat flows.

p.s. a random idea: another "interface" for posthog could be a github app and actions, similar to Greptile comments

@annikaschmid

annikaschmid commented Mar 21, 2025

@skoob13

I've been thinking about clustering sessions based on the features so we can later process samples: collected events, sessions dropped from key metrics (such as a funnel for the checkout flow), ML-based feature extraction from payloads (no ideas yet), some algorithmic extraction from the JSON (such as pre-processing to transcribe first to consume fewer tokens).

This is something we must start doing ASAP, as session replay seems to me to be the "glue" for Max and for the Editor. Other products are in a better position to be LLM-ready.

The Replay team has done some (not so successful) work with clustering in the past and we let it rest a bit, but we now have two super AI-keen people on the team, and while we still have to do our planning, this will likely be one of our key objectives for next quarter.

We also have an offsite next week, where we were planning to speak about this. Since Replay is already integrated into Max AI, we can chip away at making Replay more intelligent over time, and if Max AI is available in a code editor, Replay insights will be available as well.

But yeah, the key thing for Replay in an AI-first world is to answer:
V1: Show me recordings where users are struggling / experiencing bugs
V2: Improve my app / onboarding flow / etc. based on what users are struggling with

paging @pauldambra @veryayskiy @sortafreel

@pauldambra
Member

yep, i'd be amazed if we didn't end up with something around the "watch my" or "find my" recordings for Q2

@EDsCODE
Member

EDsCODE commented Mar 21, 2025

What do we think about the marketability/distributability of an extension? I've always found it odd that extensions aren't more popular (besides the typical ones you'd need, such as language support). Or maybe I'm totally unaware of their popularity. But especially with recent trends, it seems like people are more ready to try new/forked IDEs than extensions.

@joshsny
Contributor

joshsny commented Mar 21, 2025

This looks great, I'm excited to see how this progresses!

A couple of things I would personally love to see in terms of actionable feedback in a PostHog-oriented IDE experience:

  • It would be amazing to see inline analytics in the IDE about how much users are interacting with the specific component I am looking at in a frontend. I think the inline experience will be the most powerful and the most likely to be used (a rough CodeLens sketch follows this list).
  • It's a pain switching between a separate error tracking product and your IDE, so it'd be nice to have the error data provided as context and a one-click "Fix Error" action, similar to the existing actions you can take when selecting code.
  • A "view replays that interact with this component" experience would be nice
  • We should automate the PostHog installation for users that come through the extension, and vice versa: for users that come via the website, we should offer to install the extension when we install PostHog for them.
  • Creating flags & experiments from the sidebar would be a nice UX
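
A rough sketch of the inline analytics idea from the first bullet (the CodeLens API is real; the component matching and PostHog lookups are hypothetical helpers):

```typescript
import * as vscode from 'vscode';

// Render a CodeLens above each detected React component with its interaction
// count, linking through to the matching replays.
class PostHogLensProvider implements vscode.CodeLensProvider {
  async provideCodeLenses(document: vscode.TextDocument): Promise<vscode.CodeLens[]> {
    const lenses: vscode.CodeLens[] = [];
    for (const { name, line } of matchComponents(document.getText())) {
      const count = await fetchComponentInteractions(name); // hypothetical API call
      lenses.push(
        new vscode.CodeLens(new vscode.Range(line, 0, line, 0), {
          title: `${count} interactions this week - view replays`,
          command: 'posthog.openReplays', // hypothetical command id
          arguments: [name],
        })
      );
    }
    return lenses;
  }
}

declare function matchComponents(source: string): { name: string; line: number }[];
declare function fetchComponentInteractions(component: string): Promise<number>;

export function activateLenses(context: vscode.ExtensionContext) {
  context.subscriptions.push(
    vscode.languages.registerCodeLensProvider({ language: 'typescriptreact' }, new PostHogLensProvider())
  );
}
```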

Some things that could help us achieve this faster:

  • Error tracking is implementing source maps uploading for errors, this could give us a reliable way to match event / session / error data from production to your dev environment.
  • As @skoob13 mentioned, using a webview for showing some of the existing parts of the app UI will shortcut a lot of work
  • For popular frameworks that have a file-oriented structure (e.g. NextJS), we can map the current file to a group of pages for surfacing session and analytics data in the absence of source maps (a rough sketch of this mapping follows).
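
A rough sketch of that mapping (the convention handling is an assumption and only covers the basic cases):

```typescript
// Map a Next.js-style file path to the route it serves, so we can query
// pageview/session data for the file that is currently open.
export function fileToRoute(filePath: string): string | null {
  const match = filePath.match(/(?:pages|app)\/(.*?)(?:\/page)?\.(?:tsx|jsx|ts|js)$/);
  if (!match) return null;
  const route = match[1]
    .replace(/\[\.\.\..+?\]/g, '*') // catch-all segments: [...slug] -> *
    .replace(/\[(.+?)\]/g, ':$1')   // dynamic segments: [id] -> :id
    .replace(/(^|\/)index$/, '');   // pages/index.tsx -> /
  return '/' + route;
}

// fileToRoute('src/pages/blog/[slug].tsx') -> '/blog/:slug'
// fileToRoute('app/settings/page.tsx')     -> '/settings'
```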

Some things that don't feel that useful:

  • It seems less likely to me that high-level strategy decisions are going to be made in an IDE context, so any insights that are being generated should be contextual to the file I am currently looking at
  • I think we can do a lot to improve the experience of developers building products within an IDE experience before we try and get non-developers using an IDE. The latter is a large behavioural shift; I think we're far away from "everyone codes" in general.

@joshsny
Contributor

joshsny commented Mar 21, 2025

p.s. a random idea: another "interface" for posthog could be a github app and actions, similar to Greptile comments

@HamedMP I think this is well worth us exploring - it'd be nice to have an experience in Max AI where we upload the diff from the PR and it gives you a bunch of actionable comments based on the analytics of the things you are changing

@HamedMP
Contributor

HamedMP commented Mar 21, 2025

  • see inline analytics in the IDE about how much users are interacting with the specific component

@joshsny oh, you sparked the idea that just like autocapture on the web, we can do autocapture for components (similar to the react dev component view) to measure each component's performance/actions/bugs (connected with the recording json data)

re. the pr diff idea, 💯%, i think that'd be really cool to hack together!

@annikaschmid

we can do autocapture for components (similar to the react dev component view) to measure each component's performance/actions/bugs (connected with the recording json data)

We have this concept in feature flags, well, at least theoretically. The idea was for developers to highlight components in the code that would then be tracked in PostHog automatically as a “feature”. TBH I haven’t looked much into whether this is used at all; it lacked strong engineer ownership when we built it, so we never really got anywhere with it. I am not sure I can link to it directly (on my phone), but it’s the “Method 2: Using the PostHogFeature component” section in our React docs: https://posthog.com/docs/libraries/react
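
For reference, usage looks roughly like this (from memory, so check the docs above for the exact props):

```tsx
import { PostHogFeature } from 'posthog-js/react';

// Everything inside is rendered when the flag matches, and views/interactions
// with the wrapped element are tracked against that "feature" automatically.
function NewOnboarding() {
  return (
    <PostHogFeature flag="new-onboarding-flow" match={true}>
      <button>Start the new onboarding</button>
    </PostHogFeature>
  );
}
```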

Anything we can do to make this easier/automatic to set up in an AI-first world would be awesome. I think it was a hard concept to teach to engineers and to learn how to use at the right point in time.

@Twixes added the epic label Mar 24, 2025
@Twixes
Member

Twixes commented Mar 24, 2025

@EDsCODE's

But especially with recent trends, it seems like people are more ready to try new/forked IDEs than extensions

💯 We could start with an extension, but I'm certain that's not where the VC-scale win is. We should fork VS Code from the get-go and go against Curs0r et al. This is significantly more ambitious (to be competitive, we'll need not just a great AI agent, but also excellent AI autocomplete), but that's how we build the ultimate Product OS for the AI-powered future of building. Extremely ambitious, to be honest, but our tagline is "how developers build successful products".

We've already put a surprising amount of thought into the IDE UX with @corywatilo's IDE-like designs for the web app and we don't actually need to worry about distribution, especially with our brand. But we must 100% create unprecedented value.

In coding, we're fortunate to have a late mover's advantage. We can see what works out there. Still a major investment, but I'm confident we can ship a great experience there and embed into the builder's relationship with code.

Then, when we own the relationship, we get into novel territory. In-IDE integration with analytics, flags, error tracking, etc. becomes 100% natural, and you can ask the AI agent to tackle any of these problems right away. Imagine finding the most common backend error, and solving it in one click without ever leaving the IDE. Then you see a funnel drop-off problem – the agent addresses it right away, and after checking that the event is instrumented correctly, it drafts a proposed UX change and the right experiment to A/B test it.

Short framing:
Curs0r/W1ndsurf helps you build any random thing.
PostHog's IDE helps you build the successful thing.

@jamesefhawkins
Collaborator Author

jamesefhawkins commented Mar 24, 2025

We should fork VS Code from the get go and go

The only reason I've said not to do this is that I've heard it's a hassle; if you're up for it, let's do this. It'd be much better positioning-wise.

Can we build it into a fork of VS Code as an extension, so we can also distribute it into Curs0r etc. too? Or does this limit us?

@jamesefhawkins
Collaborator Author

I've been thinking about clustering sessions based on the features

we could just sample and assume that this way we'll catch most of the stuff that affects the most users anyway

@joethreepwood
Contributor

Curs0r/W1ndsurf helps you build any random thing.
PostHog's IDE helps you build the successful thing.

Love this. I can excitedly see a whole world where the IDE becomes what people think of as our main product and fully cements the Product OS vision.

@pawel-cebula
Contributor

pawel-cebula commented Mar 24, 2025

A lot of super exciting stuff in here!

Sharing some raw, high-level thoughts about the key considerations and trade-offs between the areas I'd consider important here: ICPs, interfaces, and business models.

ICP vs interface

  • In this context, defining the ICP narrowly feels even more crucial than for the existing PostHog products
  • So far, our primary interface has been the web app - which you'd typically be using to access products like ours whether you're a founder, engineer, PM, marketer, designer, sales or support agent
  • When we move towards helping people actually build products, this starts to vary quite a bit by ICP/function
  • We already see this divergence in the context of AI coding tools, e.g.
    • Engineers tend to pick AI-supported IDEs, preferably forks of existing ones rather than new ones, to make the transition easier (e.g. Cursor, Windsurf)
    • Non-engineers tend to pick newer, standalone environments that abstract a lot of complexity (e.g. Replit, Lovable)
  • There is one interface that might go against this - MCPs could potentially integrate with all other interfaces if they end up becoming the long-term standard

Interface trade-offs

  • I believe we want to continue doubling down on our existing ICP (founders, product engineers), which IMO narrows down the choice to interfaces that meet them where they already are, e.g.
    • MCP
    • Extension
    • IDE
  • As we go down this list
    • Stickiness and switching costs (to/from) increase
    • Value capture opportunities improve
    • Reach becomes more limited
    • Development effort grows exponentially

Business model vs interface

  • The key question: Where does the value come from?
    • Are we monetizing the tool itself (e.g. IDE)?
    • Or are we using it to drive adoption and usage of our existing and future products?
  • AI IDEs have been able to monetize very well through subscriptions, but they're operating at very thin margins. They can capture value from everyone (incl. hobbyists) but under-monetize the successful ones.
  • Our goal is to help developers build successful products. Therefore I'd see our efforts in this area as a way to get more users and grow usage with existing users, and monetize through that, as they scale and become more successful.
  • This also potentially helps to reduce the concerns around monetizing some of the interfaces - extensions are famously difficult to monetize. MCPs will probably be similar in that regard. But if the goal is to grow long-term usage, then this is not a concern.

With this in mind, I'd lean toward starting with a lighter interface that can integrate well into existing tools (extensions, MCPs)

  • We can still deliver unique value in an area where we have a unique advantage (vs having to compete on core AI-assisted IDE experience)
  • We can get more validation that we're on the right track without making too many assumptions upfront
  • It doesn't prevent us from evolving towards becoming a full-fledged IDE (and we could re-use a lot of the building blocks we build along the way)

One area I might be underestimating is the limitations that certain interfaces will impose on the UX we can deliver. But there seems to be so much low-hanging fruit in integrating product analytics/sessions/error tracking (and many others) into the development experience that, even with the more basic interfaces, we should be able to deliver a lot of unique value.

@danielbachhuber
Contributor

We could start with an extension, but I'm certain that's not where the VC scale win is. We should fork VS Code from the get go and go against Curs0r et al.

Could we do both? VS Code fork team vs. VS Code extension team and see which can win the greatest adoption?
