A Perplexing JavaScript Parsing Puzzle, by @hillelwayne.com:
This is going to make life so much easier: URL Pattern API
https://developer.mozilla.org/en-US/docs/Web/API/URL_Pattern_API
@dave how hard would it be to try?
I think you'd be able to get faster results, with very little overhead.
If not and your work is ready for another tester, I have the hardware.
Want to parse/validate open source licenses in Rust? Check this out.
**spdx**: Helper crate for SPDX expressions.
Docs: https://docs.rs/spdx
Did #AI write better Java? The experiment's back: #1BRC tests #IDE battles & hallucination. #Zencoder #JetBrains shine, but the flaws are surprising. All about context, integration #parsing. Read Steve Poole's article now: https://javapro.io/2025/01/15/maximize-productivity-with-ai-tools-in-java-development/
ast testing with unordered dicts somewhere in the logic is "fun"
Does anyone have experience with #parsing partially overlapping text regions? For example, in #markdown we could have something like `**only bold _italic and bold** only italic_`. In #orgmode, though, there are 6 or more different inline markups. I would like to parse this stuff, but it seems impossible to describe with a grammar, because a grammar usually represents a tree, and branches don't overlap. I'm starting to wonder whether a grammar is possible at all. Does Emacs use a custom parser?
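One way to sidestep the tree problem is to not build a tree at all: treat each marker as a toggle and emit flat runs of text, each tagged with the set of styles active at that point. The sketch below is only an illustration of that idea, not how Emacs or any Org parser works. The `MARKERS` table and `parse_spans` are made-up names, and only two markers are handled.

```python
# Hypothetical marker table, reduced to two markers for brevity.
MARKERS = {"**": "bold", "_": "italic"}


def parse_spans(text):
    """Split text into (chunk, active_styles) runs.

    Every marker toggles its style on or off, so styled regions may
    overlap freely. The result is a flat list, not a tree.
    """
    runs = []
    active = set()
    buf = []
    i = 0
    while i < len(text):
        for marker, style in MARKERS.items():
            if text.startswith(marker, i):
                if buf:
                    # Close the current run before the style set changes.
                    runs.append(("".join(buf), frozenset(active)))
                    buf = []
                active ^= {style}  # toggle this style
                i += len(marker)
                break
        else:
            # No marker at this position: ordinary character.
            buf.append(text[i])
            i += 1
    if buf:
        runs.append(("".join(buf), frozenset(active)))
    return runs


for chunk, styles in parse_spans("**only bold _italic and bold** only italic_"):
    print(repr(chunk), sorted(styles))
```

A real implementation would still need rules for escaping, unmatched markers, and markers inside words, which this sketch ignores.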
Having a problem with #guile #peg #parsing library: https://lists.gnu.org/archive/html/guile-user/2024-10/msg00002.html
Does anyone know whether that library supports mutually recursive pattern definitions? Or is the library kind of broken?
Maybe these resources are interesting to you as well
Introduction to Compilers and Language Design - by Douglas Thain (haven't read yet)
Strumenta (<-- an absolute gem!)
e.g. these articles
A #Guide to #Parsing: #Algorithms and Terminology:
https://tomassetti.me/guide-parsing-algorithms-terminology/
A #tutorial on how to write a #compiler using #LLVM:
https://tomassetti.me/a-tutorial-on-how-to-write-a-compiler-using-llvm/
Weeknote! #SwiftUI, #parsing, progress on the async chapter of The #Rust Programming Language book, a second pass on a bit of orchestra #music, StaffPlusNY talk drafting, going on @changelog, and @robinsloan’s Moonbound!
https://v5.chriskrycho.com/journal/weeknote-june-17-21-2024/
Trying something new with this—I’ll give it a month or so and see if it sticks.
Why? Well, it seems good to show some *progress* for the things I am working on—since I don’t have the usual jobby-job folks to share that work with!
Context-free grammars (CFGs) are better than parsing expression grammars (PEGs), because CFGs represent how we think.
Parser combinators are similar to PEGs, so they are worse than CFGs, too.
So, don't use the Rust libraries nom or combine. Use lalrpop.
Don't use the Haskell libraries parsec, gigaparsec, attoparsec, megaparsec, or trifecta. Use Earley or happy.
See the more detailed story in my new article: https://safinaskar.writeas.com/this-is-why-you-should-never-use-parser-combinators-and-peg
The article also covers some cases where PEGs and parser combinators may still be useful, and it links to my Haskell parsing libraries.
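For readers unfamiliar with the distinction, here is a small sketch of one well-known behavioral difference: PEG choice is ordered and commits to the first alternative that matches, whereas a CFG alternation has no order. This example is not taken from the linked article; the helper names (`lit`, `choice`, `seq`, `matches_all`) are made up for illustration.

```python
def lit(s):
    """Parser for the literal s: returns the new position, or None on failure."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alts):
    """PEG ordered choice: the first alternative that matches wins for good."""
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse


def seq(*parts):
    """Match each part in turn, failing as soon as one of them fails."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def matches_all(parser, text):
    """True if the parser consumes the whole input."""
    return parser(text, 0) == len(text)


# PEG rule: S <- ("a" / "ab") "c"
rule = seq(choice(lit("a"), lit("ab")), lit("c"))
print(matches_all(rule, "ac"))   # True
print(matches_all(rule, "abc"))  # False: "a" matched first, "ab" is never tried

# Reordering the alternatives changes the accepted language.
reordered = seq(choice(lit("ab"), lit("a")), lit("c"))
print(matches_all(reordered, "abc"))  # True
```

Read as a CFG, `S -> ("a" | "ab") "c"` accepts both "ac" and "abc" regardless of how the alternatives are ordered.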
I have some machine-generated DSP-like pseudo code in a text file, with a lot of superfluous brackets and coefficients that could be combined. To make it more readable I'd like to simplify it, so I need to build an abstract syntax tree and some tree modifiers. Parsing the file is the first step; I think I'll use ANTLR/C++ for that, or should I use #Python? #development #parsing #cpp
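If the expressions in the file happen to be close enough to Python's own expression syntax (a big assumption, since the actual format isn't shown), Python's standard `ast` module can already do both steps: parse to a tree, rewrite it, and print it back without redundant brackets. A rough sketch, where `FoldCoefficients`, `_is_num`, and `simplify` are made-up names and only one rewrite rule is implemented:

```python
import ast


def _is_num(node):
    """True for a plain numeric literal (bools excluded)."""
    return (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool))


class FoldCoefficients(ast.NodeTransformer):
    """Combine numeric coefficients in products, e.g. (x * 2) * 3 -> x * 6."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # simplify the operands first
        if not isinstance(node.op, ast.Mult):
            return node
        left, right = node.left, node.right
        # constant * constant -> constant
        if _is_num(left) and _is_num(right):
            return ast.Constant(value=left.value * right.value)
        # (expr * c1) * c2 -> expr * (c1 * c2)
        if (_is_num(right) and isinstance(left, ast.BinOp)
                and isinstance(left.op, ast.Mult) and _is_num(left.right)):
            folded = ast.Constant(value=left.right.value * right.value)
            return ast.BinOp(left=left.left, op=ast.Mult(), right=folded)
        return node


def simplify(source):
    """Parse one expression, fold coefficients, and print it back."""
    tree = ast.parse(source, mode="eval")
    tree = FoldCoefficients().visit(tree)
    # ast.unparse (Python 3.9+) emits only the brackets precedence requires.
    return ast.unparse(tree)


print(simplify("((a * 2) * 3) + ((b))"))
```

For a format that isn't Python-compatible, the same visitor-style rewriting applies, but the tree would have to come from a dedicated grammar such as one written for ANTLR.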
Getting ready for the Decoder part, reviewing a few ways to break up a String
https://www.whynotestflight.com/excuses/wait-how-do-i-scan-text-again/
The RegExBuilder regex is surprisingly speedy in context!
Parsing crates:
chumsky | A #parser library for humans with powerful error recovery
https://crates.io/crates/chumsky
winnow | making #parsing a breeze
https://crates.io/crates/winnow
---------------------------------------
Crates in the #async ecosystem:
pollster | an incredibly minimal async executor for #Rust that lets you block a thread until a future completes
https://crates.io/crates/pollster
smol | A small and fast async runtime
Computer scientist: Look at this fascinating new #parsing technique!
Compiler writer: Wow it’s beautiful!
Computer scientist: So you’re going to use it now, right?
Compiler writer: …
Computer scientist: You’re going to continue writing recursive descent parsers, maybe shunting yard sometimes at best, aren’t you?
Compiler writer: Yeah
Computer scientist: Yeah
difftastic, https://github.com/Wilfred/difftastic.
Difftastic is a CLI diff tool that compares files based on their syntax, not line-by-line. Difftastic produces accurate diffs that are easier for humans to read.
It supports many languages and is compatible with Git.
I'll actually just make my recursive descent parser module a whole new project. I think it deserves that. Perhaps other people need to parse config files and want to use my kick-ass grammar for it. The code is found here:
https://github.com/akyuute/mybar/blob/dev/mybar/parse_conf.py
If you've had a dull day, I have just the weird bug for you: DevTools getting confused between `tan` and `tan()`
Test on #CodePen https://codepen.io/thebabydino/pen/JjxvKLe
Chrome bug https://bugs.chromium.org/p/chromium/issues/detail?id=1504563
Safari bug https://bugs.webkit.org/show_bug.cgi?id=265254
Thanks @bramus for filing the bugs.
Doesn't happen in Firefox, but happens in both Chrome & Safari.
Bison is like weird flex but okay.