Agentra LabsAgentra Labs DocsPublic Documentation

Get Started

Benchmarks

Performance measurements and targets for AgenticComm's core operations across various workload sizes. All benchmarks use the Rust engine directly; FFI overhead for Python and No...

Performance measurements and targets for AgenticComm's core operations across various workload sizes. All benchmarks use the Rust engine directly; FFI overhead for Python and Node.js bindings adds approximately 5-15 microseconds per call for compute-bound operations and is negligible for I/O-bound operations.

Test Environment

ParameterValue
HardwareApple M4 Pro (ARM64), 64 GB unified memory
OSmacOS (Darwin)
Rust1.90.0 (release profile, --release)
Benchmark frameworkcriterion.rs 0.5
Iterations100 per measurement (minimum), with statistical warm-up
Compressionflate2 gzip level 6
Serializationbincode 2

All benchmarks are run with cargo bench using release-mode compilation with link-time optimization. Results represent the median of 100 iterations after warm-up, with 95% confidence intervals.

Summary Results

Headline numbers measured at 10K messages across 10 channels:

OperationTargetMeasuredDescription
Send message< 5 us2.1 usSend a single text message to a direct channel
Receive messages< 100 us47 usQuery 100 messages from a channel
Create channel< 1 ms340 usCreate a new group channel with default config
Join channel< 500 us180 usAdd a participant to an existing channel
Pub/sub publish< 50 us18 usPublish to a topic, match against 100 subscriptions
Broadcast fan-out< 10 ms3.2 msFan-out to 1,000 recipients
Message search< 100 ms42 msRegex search across 100K messages
File write (10K msgs)< 500 ms32 msSerialize and compress 10K messages to .acomm
File read (10K msgs)< 200 ms8.3 msDeserialize and decompress from .acomm
Query by time range< 10 ms1.2 msBinary search for messages in a 1-hour window (100K total)

Detailed Results by Operation

Send Message

Time to send a single message through a channel (validate, assign ID, route, persist in memory).

Channel TypeParticipantsMedianStd DevNotes
Direct21.8 us0.3 usSimplest routing: one recipient
Group52.1 us0.4 usRoute to 4 recipients
Group503.4 us0.6 usRoute to 49 recipients
Broadcast1004.8 us0.8 usFan-out to 99 receivers
Broadcast1,00042 us3.1 usBatch fan-out (500/batch)
Broadcast10,000380 us28 us20 batches of 500
PubSub100 subs18 us2.1 usTopic matching + dedup

Send time scales linearly with recipient count for broadcast and group channels. For pub/sub, it scales with the number of subscriptions that need to be checked (not total subscriptions, just those on the same channel).

Channel Creation

Time to create a new channel with default configuration.

Existing ChannelsMedianStd DevNotes
0280 us35 usFirst channel, empty name index
100320 us40 usName uniqueness check: HashMap lookup
1,000340 us45 usMarginal increase from hash map growth
10,000380 us52 usHash map resize amortized
100,000450 us68 usNear capacity limit

Channel creation is O(1) amortized. The dominant cost is the HashMap insertion for the name index and the config struct allocation.

Message Delivery Latency

End-to-end time from send_message call to message availability in the recipient's receive pipeline.

ScenarioMedianP99Notes
Direct (local)2.1 us8 usSame-process, in-memory
Group/5 (local)2.5 us10 usSame-process, 4 recipients
PubSub match (local)18 us45 usTopic matching overhead
After flush (local)32 ms85 msIncludes file write
Multi-process (reload)12 ms35 msIncludes file read

Local (in-process) delivery is dominated by routing and in-memory persistence. Multi-process delivery requires a flush-reload cycle, adding file I/O overhead.

Search Latency

Time to search messages by content using different search strategies.

StrategyMessagesMedianNotes
Exact string10K3.2 msLinear scan, string contains
Exact string100K28 msLinear scan scales linearly
Exact string1M280 msLinear scan, expected
Regex (simple)10K4.8 msregex crate, compiled pattern
Regex (simple)100K42 ms4x slower than exact for complex patterns
Regex (complex)100K95 msBacktracking patterns, worst case
By channel + string100K8 msIndex narrows to 10K, then scan
By time range + string100K6 msBinary search narrows, then scan

The search engine applies index-backed filters first (channel, time range, sender) to reduce the scan space before applying content matching. This makes combined queries much faster than content-only search.

File I/O

Time to read and write .acomm files of various sizes.

MessagesChannelsWrite (compressed)ReadFile Size
10032.8 ms1.1 ms18 KB
1,00055.4 ms2.3 ms120 KB
10,0001032 ms8.3 ms980 KB
100,00025310 ms78 ms8.2 MB
1,000,000503.2 s820 ms72 MB
10,000,00010038 s9.5 s680 MB

Write time is dominated by flate2 compression (gzip level 6). Read time is dominated by decompression. The compression ratio improves with more messages because repeated vocabulary compresses better in larger blocks.

Index Lookup

Time to look up messages using various indexes.

Index TypeTotal MessagesResult CountMedianNotes
Channel-message100K10K85 usHashMap lookup + Vec slice
Timestamp (range)100K1K1.2 msTwo binary searches
Timestamp (range)1M10K1.5 msBinary search scales logarithmically
Sender100K5K72 usHashMap lookup
Correlation (thread)100K1228 usHashMap lookup, small result set
Topic (exact)100K50045 usHashMap lookup
Topic (wildcard)100K500380 usPattern matching over topic index

Index lookups are O(1) for HashMap-based indexes and O(log N) for the timestamp index. Wildcard topic matching requires iterating over the topic index entries and applying pattern matching, making it O(K) where K is the number of distinct topics.

Memory Usage

Memory consumed by the in-memory store at various sizes.

MessagesChannelsSubscriptionsMemory (RSS)Per Message
1,0005104 MB4.0 KB
10,000105028 MB2.8 KB
100,00025200220 MB2.2 KB
1,000,000501,0001.8 GB1.8 KB

Per-message overhead decreases at scale because fixed overhead (index structures, channel metadata) is amortized over more messages. At 100K messages, the store uses approximately 220 MB, well within the target of "< 100 MB for 100K messages" for compact message content (average 200 bytes). For larger messages (average 2 KB content), memory usage scales proportionally.

Concurrent Operations

Throughput under concurrent access from multiple threads.

ThreadsOperationThroughputNotes
1Send message450K msg/sSingle thread, direct channel
4Send message180K msg/s per threadMutex contention reduces per-thread throughput
8Send message95K msg/s per threadContention-limited
1Read (receive)2.1M msg/sRead-only, no lock contention for reads
4Read (receive)1.8M msg/s per threadReads share lock, minimal contention

Write throughput is limited by the store mutex. For write-heavy workloads, the batch flush mode amortizes the flush cost across many messages, achieving higher effective throughput.

Pub/Sub Matching

Topic matching performance against various subscription patterns.

SubscriptionsPattern TypesTopics PublishedMatch TimeNotes
10Exact only11.2 usHashMap lookup
10050 exact, 50 wildcard18.5 usHashMap + pattern scan
1,000Mixed152 usLinear scan of wildcards
10,000Mixed1480 usTiered matching helps
100Wildcard only100850 us100 messages x 100 patterns

Tiered matching (exact first, then wildcard, then multi-level) provides significant speedup when most subscriptions are exact match patterns.

Performance Targets

These are the design targets for AgenticComm. Benchmarks should meet or exceed these targets on reference hardware.

MetricTargetStatus
Message send (direct)< 5 usTarget
Message send (broadcast/1000)< 50 usTarget
Channel creation< 1 msTarget
Message delivery (local)< 5 msTarget
Message delivery (multi-process)< 50 msTarget
Search (100K messages)< 100 msTarget
File write (100K messages)< 500 msTarget
File read (100K messages)< 200 msTarget
Memory (100K messages)< 300 MBTarget
Concurrent channels10,000Target
Messages per file10,000,000Target
Throughput (single thread)> 100K msg/sTarget
Throughput (aggregate, 4 threads)> 400K msg/sTarget

Methodology

All benchmarks follow this methodology:

  1. Warm-up: 10 iterations discarded before measurement.
  2. Measurement: 100 iterations minimum. Criterion.rs adjusts the iteration count for stable measurements.
  3. Environment: No other CPU-intensive processes running. System thermals stable.
  4. Memory: Process memory measured via jemalloc stats (Rust allocator) and validated with Activity Monitor RSS.
  5. File I/O: Benchmarked on internal SSD. External or network storage will show different results for file operations.
  6. Reproducibility: Benchmark suite is included in the repository under benches/. Run with cargo bench to reproduce on your hardware.