Hey IH!
Most benchmarks are single-tasking relics. In 2025 we run local AI models (LLMs, speech recognition, etc.) and complex data processing simultaneously in the browser, which demands a new standard for performance measurement.
I've built SpeedPower.run to address this need for modern, comprehensive benchmarking. Instead of a single isolated task, our system runs a rigorous set of seven concurrent benchmarks, including core JavaScript operations, high-frequency data-exchange simulations, and the execution of five different AI models. The suite is specifically designed to force concurrent execution across every available CPU and GPU core in your device, simulating a real-world, multi-tasking environment.
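To give a feel for the shape of the harness, here's a simplified sketch. (`fakeTask` and `runConcurrently` are illustrative stand-ins, not our actual code; the real suite runs each task in its own Worker so CPU cores genuinely compete, while plain async functions here only illustrate the concurrent scheduling.)

```javascript
// Sketch: launching several benchmark tasks concurrently and timing both
// each task and the overall wall clock. True multi-core parallelism
// requires Workers; this shows only the scheduling shape.
async function runConcurrently(tasks) {
  const start = performance.now();
  const perTask = await Promise.all(
    tasks.map(async (task) => {
      const t0 = performance.now();
      await task();
      return performance.now() - t0; // duration of this task
    })
  );
  return { perTask, wallClock: performance.now() - start };
}

// Stand-in workloads: timers simulate tasks of different lengths.
const fakeTask = (ms) => () => new Promise((r) => setTimeout(r, ms));

runConcurrently([fakeTask(50), fakeTask(80), fakeTask(30)]).then(({ wallClock }) =>
  console.log(`wall clock: ${wallClock.toFixed(0)} ms (vs ~160 ms if run serially)`));
```

Because the tasks overlap, the wall-clock time tracks the longest task rather than the sum of all of them.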
Our benchmark is constructed using the most popular and cutting-edge web AI stack: TensorFlow.js and Transformers.js, ensuring relevance and fidelity to applications being built today.
The Challenge: Traditional scores fail to capture this complexity. Does our overall geometric-mean score accurately and transparently reflect the true concurrent processing power of your browser? We believe our holistic approach provides the most accurate answer.
The test is pure and simple: no network interference, no installation or external dependencies, just a raw measurement of your device's compute capabilities as seen by the browser. See your comprehensive score and performance breakdown here: https://speedpower.run/?indiehacker
I'll be here all day to discuss the specifics of our multi-tasking scoring logic, the selection of the seven benchmarks, and how we derived the geometric mean to best represent concurrent power.
Another benchmark? How is this different from JetStream 2 or Speedometer? I feel like we’ve solved browser speed.
I just checked the 'About' page. They are using Transformers.js v3 for the LLM and Speech tests. That uses WebGPU compute shaders for parallel inference. If you're comparing this to old-school JS benchmarks, you're missing the point. We're talking about asynchronous command queues in the browser. I'd be curious to see how the 'Score Stability' handles thermal throttling over multiple runs.
Spot on! Thermal throttling is the 'invisible variable' in mobile benchmarking.
We don't normalize for it because we want to measure peak real-world capacity. However, that’s exactly why we implemented the 'Warm-up Execution.' We prime the JIT and compile the shaders first, so we aren't measuring 'startup coldness.'
If you run the benchmark three times in a row on a fanless MacBook Air, you will see the score dip. To us, that's a feature, not a bug: it reveals the device's true sustained compute limit for modern, heavy AI workloads.
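To make that concrete, here's a rough sketch of the shape of our warm-up pass (simplified; `workload` is a hypothetical stand-in for one of the seven tests, and the real suite also compiles WebGPU shaders during warm-up):

```javascript
// Minimal sketch of a warm-up phase before timing a benchmark.
function benchmark(workload, { warmupRuns = 3, timedRuns = 5 } = {}) {
  // Warm-up: let the JIT tier up the hot code paths without recording times,
  // so we aren't measuring "startup coldness".
  for (let i = 0; i < warmupRuns; i++) workload();

  // Timed runs: only these contribute to the score.
  const times = [];
  for (let i = 0; i < timedRuns; i++) {
    const start = performance.now();
    workload();
    times.push(performance.now() - start);
  }
  return Math.min(...times); // best run = peak capacity, not the average
}

// Example workload: a tight numeric loop the JIT can optimize.
const score = benchmark(() => {
  let acc = 0;
  for (let i = 0; i < 1e6; i++) acc += Math.sqrt(i);
  return acc;
});
console.log(`best run: ${score.toFixed(2)} ms`);
```

Taking the minimum of the timed runs is what makes the score a "peak capacity" number; a thermally throttled device simply can't reach the same minimum on later runs.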
Good catch. They mention a 'Warm-up Execution' to prime the caches and JIT, but they also say to run it several times for the maximum score. It seems they are measuring 'Peak Capacity' rather than average sustained performance, which makes sense for bursty AI tasks in a web app.
Speedometer tests how fast a page feels; this tests if your browser can actually handle local LLMs 😉. Most benchmarks are single-threaded relics.
But if I'm running an LLM, isn't that almost entirely a GPU bound task? Why does the main thread communication even matter that much once the model is loaded into VRAM?
That’s a common misconception we’re trying to highlight! You’re right that the matrix multiplication happens on the GPU, but an LLM in a browser isn't a 'set it and forget it' process.
With Transformers.js v3, orchestration, tokenization, KV cache management, and autoregressive decoding still require constant 'handshakes' between the worker and the main thread. If your 'Exchange' performance is poor, the GPU sits idle waiting for the next instruction. We specifically included the SmolLM2-135M test to show that even a 'small' model can be bottlenecked by how efficiently the browser moves data between threads.
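For the curious, here's a stripped-down sketch of what an "Exchange"-style round-trip measurement can look like. (This uses a MessageChannel ping-pong for illustration, which works in browsers and modern Node; our real test talks to a Worker running inference.)

```javascript
// Sketch: measuring the average main<->port message round trip, the
// "handshake" cost that can leave a GPU idle between decode steps.
function measureRoundTrips(iterations = 1000) {
  return new Promise((resolve) => {
    const { port1, port2 } = new MessageChannel();
    // Echo side: stands in for a worker acknowledging each handshake.
    port2.onmessage = (e) => port2.postMessage(e.data);

    let remaining = iterations;
    const start = performance.now();
    port1.onmessage = () => {
      if (--remaining > 0) {
        port1.postMessage(remaining);
      } else {
        const total = performance.now() - start;
        port1.close();
        port2.close();
        resolve(total / iterations); // mean round trip in ms
      }
    };
    port1.postMessage(remaining); // kick off the ping-pong
  });
}

measureRoundTrips().then((ms) =>
  console.log(`avg round trip: ${ms.toFixed(4)} ms`));
```

Multiply that per-message cost by the number of handshakes in an autoregressive decode loop and the bottleneck becomes obvious.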
My phone's browser got a better score than my 5-year-old desktop. That feels totally unbelievable, haha. I thought the desktop would crush it with more cores. What gives with the mobile anomaly?
That's what we call the "Parallel Paradox," and it's genuinely fascinating! We've seen some modern mobile ARM chips show better task-switching efficiency than older x86 desktops, due to more aggressive, thermal-aware scheduling in the mobile browser engines. Raw clock speed isn't the whole story anymore. Our scoring uses a weighted geometric mean, where JavaScript and Exchange efficiency are key factors.
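For transparency, the scoring math has roughly this shape (the weights and sub-scores below are illustrative placeholders, not our published values):

```javascript
// Sketch of a weighted geometric mean:
//   exp( sum(w_i * ln(s_i)) / sum(w_i) )
// Unlike an arithmetic mean, one weak sub-score drags the whole
// result down, which rewards balanced concurrent performance.
function weightedGeometricMean(scores, weights) {
  const totalWeight = Object.values(weights).reduce((a, b) => a + b, 0);
  const logSum = Object.entries(scores)
    .reduce((acc, [name, s]) => acc + weights[name] * Math.log(s), 0);
  return Math.exp(logSum / totalWeight);
}

// Hypothetical sub-scores and weights, for illustration only.
const subScores = { javascript: 1200, exchange: 800, llm: 950 };
const weights   = { javascript: 2,    exchange: 2,   llm: 1   };
console.log(weightedGeometricMean(subScores, weights).toFixed(1));
```

This is why a phone with balanced JavaScript and Exchange numbers can outscore a desktop that is fast on one axis but slow on another.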
Is this test network-dependent at all? Do I need a gigabit connection for a high score? Always skeptical of benchmarks that don't clearly state that.
A totally fair skepticism. Absolutely not. This is a Zero Network Interference test. All ~350MB of data (AI models and assets) is fully preloaded into browser memory before the timer starts. This is a pure local compute test, not a network speed test.
I love the 'Exchange' benchmark. Most devs ignore the cost of moving data between the main thread and workers. It’s the silent killer of performance.
Exactly. If you're building a real-time speech-to-text app like their Moonshine Tiny test exercises, your GPU inference might be fast, but if your OffscreenCanvas or buffer transfers are slow, the UX feels laggy. This is the first tool I've seen that quantifies the IPC overhead specifically.
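For anyone who hasn't hit this before, the copy-vs-transfer difference is easy to demonstrate (a sketch that works in browsers and modern Node):

```javascript
// Sketch: structured-clone copy vs. zero-copy transfer of an ArrayBuffer.
const { port1, port2 } = new MessageChannel();
const buf = new ArrayBuffer(16 * 1024 * 1024); // e.g. a decoded audio buffer

port1.postMessage(buf);          // structured clone: copies all 16 MB
console.log(buf.byteLength);     // sender still owns its copy: 16777216

port1.postMessage(buf, [buf]);   // transfer list: moves ownership, no copy
console.log(buf.byteLength);     // 0 -- the buffer is detached on this side
port1.close();
port2.close();
```

The clone path pays the full copy cost on every message; the transfer path is near-free but detaches the buffer, so the sender can't touch it afterwards. That trade-off is exactly the kind of thing an 'Exchange' score surfaces.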