Marshall Pierce
I write software.
I like working with multi-disciplinary teams on difficult problems with "best tool for the job" technology. I've
worked with a lot of different tech over the years, and I try to work with other experts whose strong points complement where
I'm less experienced. If you've got that sort of problem, get in touch ↴.
You can contact me via email at (my first name @ this domain) or on Twitter. I like sharing my knowledge, so feel free to reach out if
you have any questions. Also, every once in a while I write a blog post.
Interests
- Performance
- Cache friendliness, SIMD, GPU, and all that. I'm especially interested in finding ways to build
abstractions that provide ergonomic APIs for consumers without giving up the hard-won performance gains. A fast
routine that forces consumers to, say, allocate memory each time is a missed opportunity.
- Concurrency and parallelism
- As an industry we seem to be fascinatingly bad at building concurrent systems that are both fast and correct. I
think the languages Rust and Pony are taking interesting approaches for their respective domains, but
it's not just a matter of picking the right language.
- Machine learning and bioinformatics
- These have pretty fascinating applications in the real world, and are also in need of better software. They might
seem like totally separate fields, but from a software perspective, they're not as different as you might think.
They're both answering questions with significant computational effort on very large volumes of data, ideally across
many machines, and increasingly with the use of hardware acceleration (GPUs, TPUs, FPGAs,
...). The software landscape that domain experts have to choose from in these fields is still young, and to me it
looks like it could benefit from some tidy, fast abstractions.
(Also, machine learning is not so uncommon inside bioinformatics.)
- Security and crypto
- Double checking that input validation is done correctly is great, but addressing entire classes of bugs is
better (e.g. replacing seemingly never-ending bug fountains like OpenSSL with safer equivalents), so I'm excited
about projects like MesaLock and ring. I wouldn't call myself a cryptographer, but I know enough to
be concerned when I feel the need to reach for a cryptographic primitive.
- Distributed systems
- All the problems and opportunities of concurrent systems, but with more hierarchies of cache and looser coherency.
- Type systems
- We're still just scratching the surface of how compilers can help us make maintainable systems. Halide is a pretty interesting example of better human/compiler communication,
and in a very different way, so are languages like Idris.
- WebAssembly
- It's great for getting code running in a browser, but I'm even more optimistic about its utility as a sandboxing/plugin mechanism.
- Privacy
- This page, for instance, does not track you, and is served over HTTPS so your ISP can't inject content into it.
I'm glad to see increasing consumer awareness of the importance of privacy.
- Learning
- I'm nearly always working on something new. I move between digging deeper in domains I'm fairly comfortable with,
like moving from scalar performance tuning to x86 SIMD, and picking something totally different, like WebAssembly,
or bare metal programming for microcontrollers.
- Other stuff
- I don't only write software.
Open source
A few of my open source projects:
- The Rust version of HdrHistogram. All too often,
systems are measured via numbers with little practical utility (and debatable mathematical significance) like "mean latency". HdrHistogram
is a useful data structure for building tools that banish such pseudo-metrics from your systems. It has a small,
constant memory footprint with very fast updates, and a compact encoding you could build a whole metrics pipeline around.
- Rust Base64. The de facto base64 implementation for the
Rust ecosystem, and an opportunity to practice low level optimization. Do you need multiple GiB/s of
base64? Maybe not yet...
- A parser for JVM hprof heap dumps. Created out of a need to efficiently process heap dumps that were too big for the existing heap dump analysis tools. A bit odd to write a tool for the JVM ecosystem in Rust, but it enabled some performance and memory footprint wins that are nice to have on billion-object heaps.
- Rust implementation of StreamVByte. StreamVByte is a SIMD-friendly approach to integer compression
with a reference implementation in C. The Rust implementation is a way to explore Rust's SIMD support
as well as the ability to expose zero-cost abstractions. And, of course, if you need to read or write billions
of ints per second, it can do that too.
- Metrics integration with Guice. Automates capturing
Metrics data via Guice AOP.
- HdrHistogram-backed Reservoir for
Metrics. The default reservoir types in Metrics have various shortcomings; this one uses HdrHistogram
internally to provide fast, mathematically sound measurements.
- Integration between Jersey 2 and Metrics. A
flexible way to capture Metrics performance data for your Jersey 2 services.
- Rust library for accessing the ELF auxiliary vector.
libc arcana? Unsafe navigation of the process memory? This library has it all.
- Java URL builder. The URL building landscape in Java
is at best uneven. This library does it correctly.
For more, see the rest of my OSS on Bitbucket and GitHub, and scattered across various organizations I've been a part of.