arxiv:2603.24624

ReSyn: A Generalized Recursive Regular Expression Synthesis Framework

Published on Jun 13 · Submitted by Seongmin Kim on Jun 19

Abstract

ReSyn, a divide-and-conquer framework, improves regex synthesis accuracy by decomposing complex problems into sub-problems. It is paired with Set2Regex, a parameter-efficient synthesizer that captures the permutation invariance of examples.

Existing Programming-By-Example (PBE) systems often rely on simplified benchmarks that fail to capture the high structural complexity of real-world regexes, such as deeper nesting and frequent use of union operations. To overcome the resulting performance drop, we propose ReSyn, a synthesizer-agnostic divide-and-conquer framework that decomposes a complex synthesis problem into manageable sub-problems. We also introduce Set2Regex, a parameter-efficient synthesizer that captures the permutation invariance of examples. Experimental results demonstrate that ReSyn significantly boosts accuracy across various synthesizers, and that its combination with Set2Regex establishes a new state of the art on a challenging real-world benchmark. The complete source code, datasets, and pre-trained model checkpoints are publicly available at https://github.com/mrseongminkim/ReSyn.
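
To make the divide-and-conquer idea concrete, here is a minimal toy sketch in Python. It is not the ReSyn algorithm or Set2Regex: the function name `synthesize` and the splitting rules (factor out a common prefix for concatenation, partition by first character for union) are assumptions chosen only for illustration. It works from positive examples alone and overfits to them, whereas a real synthesizer would generalize.

```python
import os
import re


def synthesize(examples):
    """Toy recursive synthesis (hypothetical, not ReSyn): build a regex that
    matches exactly the given strings by splitting the problem recursively."""
    examples = sorted(set(examples))
    if not examples:
        raise ValueError("need at least one example")

    # Base case: a single example becomes an escaped literal.
    if len(examples) == 1:
        return re.escape(examples[0])

    # The empty string sorts first; treat it as "the rest is optional".
    if examples[0] == "":
        return "(?:" + synthesize(examples[1:]) + ")?"

    # Concatenation split: factor out the shared prefix, recurse on suffixes.
    prefix = os.path.commonprefix(examples)
    if prefix:
        suffixes = [e[len(prefix):] for e in examples]
        return re.escape(prefix) + "(?:" + synthesize(suffixes) + ")"

    # Union split: partition by first character, solve each part separately.
    groups = {}
    for e in examples:
        groups.setdefault(e[0], []).append(e)
    return "|".join(synthesize(g) for g in groups.values())


if __name__ == "__main__":
    words = ["cat", "car", "dog", "do"]
    pattern = synthesize(words)
    print(pattern)
    print(all(re.fullmatch(pattern, w) for w in words))
```

Each recursive call sees a strictly smaller sub-problem, which is the property a divide-and-conquer framework relies on. In this sketch the leaf step is a plain literal; a synthesizer-agnostic design would presumably let any base synthesizer handle the leaves instead.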

Community

Paper author Paper submitter

We've released everything to use and build on ReSyn from the Hub:

Neat approach to regex synthesis. PBE tools usually struggle once things get nested or rely heavily on unions, so the divide-and-conquer strategy here sounds like a solid way to handle that complexity.

I'm curious how the Set2Regex component handles cases where the provided examples are ambiguous or don't fully define the intended pattern. Does the permutation invariance help prune the search space significantly when the input set is small?

I made a podcast on it with ResearchPod, which makes it easy to get the key concepts on the go:
https://researchpod.app/episode/0eb2f81d-649c-4cf5-bd64-d1823b2bc89e
