
A utility-first CSS framework for rapid UI development.

2024-03-05 14:23:26 +01:00
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
Improve Oxide candidate extractor [0] (#16306)

This PR adds a new candidate[^candidate] extractor with 2 major goals in mind:

1. It must be way easier to reason about and maintain.
2. It must have on-par performance or better than the current candidate extractor.

### Problem

Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good.

One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that.

The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is.

In a perfect world we would limit the characters that can be used and define a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions).

The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in).

While there is definitely some structure, we essentially work in 2 phases:

1. Try to extract `0..n` candidates. (This is the hard part)
2. Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have)

Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability.

Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not.

(we have some ideas to limit the amount of false positives but that's for another time)

### Solution

Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this:

```html
<div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div>
```

<details>
<summary>This will be parsed into the following AST:</summary>

```json
[
  {
    "kind": "functional",
    "root": "text",
    "value": {
      "kind": "named",
      "value": "red-500",
      "fraction": null
    },
    "modifier": {
      "kind": "arbitrary",
      "value": "var(--my-opacity)"
    },
    "variants": [
      {
        "kind": "static",
        "root": "hover"
      },
      {
        "kind": "functional",
        "root": "data",
        "value": {
          "kind": "arbitrary",
          "value": "state=pending"
        },
        "modifier": null
      },
      {
        "kind": "arbitrary",
        "selector": "@media(pointer:fine)",
        "relative": false
      }
    ],
    "important": false,
    "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"
  }
]
```

</details>

We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ...
Some of these patterns will be important for the new candidate extractor as well:

| Name | Example | Description |
| --- | --- | --- |
| Static utility (named) | `flex` | A simple utility with no inputs whatsoever |
| Functional utility (named) | `bg-red-500` | A utility `bg` with an input that is named `red-500` |
| Arbitrary value | `bg-[#0088cc]` | A utility `bg` with an input that is arbitrary, denoted by `[…]` |
| Arbitrary variable | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` |
| Arbitrary property | `[color:red]` | A utility that sets a property to a value on the fly |

A similar structure exists for modifiers, where each modifier must start with `/`:

| Name | Example | Description |
| --- | --- | --- |
| Named modifier | bg-red-500`/20` | A named modifier |
| Arbitrary value | bg-red-500`/[20%]` | An arbitrary value, denoted by `/[…]` |
| Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` |

Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`.

| Name | Example | Description |
| --- | --- | --- |
| Named variant | `hover:` | A named variant |
| Arbitrary value | `data-[state=pending]:` | An arbitrary value, denoted by `[…]` |
| Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)` |
| Arbitrary variant | `[@media(pointer:fine)]:` | Similar to arbitrary properties, this will generate a variant on the fly |

The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress).

This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines.

One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early.

At the time of writing this, there are a bunch of machines:

<details>
<summary>Overview of the machines</summary>

- `ArbitraryPropertyMachine`

  Extracts candidates such as `[color:red]`. Some of the rules are:

  1. There must be a property name
  2. There must be a `:`
  3. There must be a value

  There cannot be any spaces and the brackets are included. If the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`).

  ```
  [color:red]
  ^^^^^^^^^^^

  [--my-color:red]
  ^^^^^^^^^^^^^^^^
  ```

  Depends on the `StringMachine` and `CssVariableMachine`.

- `ArbitraryValueMachine`

  Extracts arbitrary values for utilities and modifiers including the brackets:

  ```
  bg-[#0088cc]
     ^^^^^^^^^

  bg-red-500/[20%]
             ^^^^^
  ```

  Depends on the `StringMachine`.

- `ArbitraryVariableMachine`

  Extracts arbitrary variables including the parentheses. The first argument must be a valid CSS variable, the other arguments are optional fallback arguments.

  ```
  (--my-value)
  ^^^^^^^^^^^^

  bg-red-500/(--my-opacity)
             ^^^^^^^^^^^^^^
  ```

  Depends on the `StringMachine` and `CssVariableMachine`.

- `CandidateMachine`

  Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility.

  ```
  hover:focus:flex
  ^^^^^^^^^^^^^^^^

  aria-invalid:bg-red-500/(--my-opacity)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ```

  Depends on the `VariantMachine` and `UtilityMachine`.

- `CssVariableMachine`

  Extracts CSS variables. They must start with `--`, must contain at least one alphanumeric character, `-`, or `_`, and can contain any escaped character (except for whitespace).

  ```
  bg-(--my-color)
      ^^^^^^^^^^

  bg-red-500/(--my-opacity)
              ^^^^^^^^^^^^

  bg-(--my-color)/(--my-opacity)
      ^^^^^^^^^^   ^^^^^^^^^^^^
  ```

- `ModifierMachine`

  Extracts modifiers including the `/`

  - `/[` will delegate to the `ArbitraryValueMachine`
  - `/(` will delegate to the `ArbitraryVariableMachine`

  ```
  bg-red-500/20
            ^^^

  bg-red-500/[20%]
            ^^^^^^

  bg-red-500/(--my-opacity)
            ^^^^^^^^^^^^^^^
  ```

  Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`.

- `NamedUtilityMachine`

  Extracts named utilities regardless of whether they are functional or static.

  ```
  flex
  ^^^^

  px-2.5
  ^^^^^^
  ```

  This includes rules like: A `.` must be surrounded by digits.

  Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`.

- `NamedVariantMachine`

  Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about.

  Another rule is that the `:` must be included.

  ```
  hover:flex
  ^^^^^^

  data-[state=pending]:flex
  ^^^^^^^^^^^^^^^^^^^^^

  supports-(--my-variable):flex
  ^^^^^^^^^^^^^^^^^^^^^^^^^
  ```

  Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`.

- `StringMachine`

  This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks.

  We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines.

  ```
  content-["Hello_World!"]
           ^^^^^^^^^^^^^^

  bg-[url("https://example.com")]
          ^^^^^^^^^^^^^^^^^^^^^
  ```

- `UtilityMachine`

  Extracts utilities, it will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility.

  It will also handle important markers (including the legacy important marker).

  ```
  flex
  ^^^^

  bg-red-500/20
  ^^^^^^^^^^^^^

  !bg-red-500/20        Legacy important marker
  ^^^^^^^^^^^^^^

  bg-red-500/20!        New important marker
  ^^^^^^^^^^^^^^

  !bg-red-500/20!       Both, but this is considered invalid
  ^^^^^^^^^^^^^^^
  ```

  Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`.

- `VariantMachine`

  Extracts variants, it will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant.

  ```
  hover:focus:flex
  ^^^^^^
        ^^^^^^
  ```

  Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`.

</details>

One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method that returns a `MachineState`.

The `MachineState` looks like this:

```rs
enum MachineState {
  Idle,
  Done(Span)
}
```

Where a `Span` is just the location in the input where the candidate was found.

```rs
struct Span {
  pub start: usize,
  pub end: usize,
}
```

#### Complexities

**Boundary characters:**

When running these machines to completion, they don't typically check for boundary characters; the wrapping `CandidateMachine` will check for boundary characters.

A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate.

```html
<div class="flex"></div>
<!--       ^    ^   -->
```

The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate.

**What to pick?**

Let's imagine you are parsing this input:

```html
<div class="hover:flex"></div>
```

The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this:

```rs
let variant_machine_state = variant_machine.next(cursor);
//  MachineState::Done(Span { start: 12, end: 17 }) // `hover:`

let utility_machine_state = utility_machine.next(cursor);
//  MachineState::Done(Span { start: 12, end: 16 }) // `hover`
```

They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility.

Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.:

```tsx
<div
  class={clsx({
    underline: someCondition(),
  })}
></div>
```

In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid is that there wasn't a variant present prior to this candidate.
E.g.:

```tsx
<div
  class={clsx({
    hover:underline: someCondition(),
  })}
></div>
```

This will be considered invalid. If you do want this, you should use quotes.

E.g.:

```tsx
<div
  class={clsx({
    'hover:underline': someCondition(),
  })}
></div>
```

**Overlapping/covered spans:**

Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example:

```csharp
public enum StackSpacing
{
  [CssClass("gap-y-4")]
  Small,

  [CssClass("gap-y-6")]
  Medium,

  [CssClass("gap-y-8")]
  Large
}
```

In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here:

1. It is an arbitrary property, e.g.: `[color:red]`
2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:`

When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state.

- This is because it is not a valid variant because it didn't end with a `:`.
- It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value.

Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines' perspective.

Obviously we want to extract the `gap-y-*` classes here.

To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing.

That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, we will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class.

The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`)? Then the `VariantMachine` would complete, but the `UtilityMachine` would not, because no utility exists after it. We will apply the same idea in this case.
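The re-scan described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not the code from the PR: `toy_extract` is a hypothetical stand-in for running the real `CandidateMachine` (it accepts runs of `[a-z0-9-]` only), `is_boundary` uses a made-up boundary set of whitespace and quotes, and `rescan_after_brackets` is a hypothetical name for the step that restarts extraction right after every `[`.

```rust
/// Inclusive byte range, mirroring the `Span` struct shown earlier.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical boundary set for this sketch only: whitespace and quotes.
fn is_boundary(byte: u8) -> bool {
    matches!(byte, b' ' | b'\t' | b'\n' | b'\r' | b'"' | b'\'' | b'`')
}

/// Hypothetical stand-in for the real machines: finds runs of `[a-z0-9-]`
/// in `input[from..to]` that start with a lowercase letter and have a
/// boundary character (or the edge of the slice) on both sides.
fn toy_extract(input: &[u8], from: usize, to: usize) -> Vec<Span> {
    let mut spans = Vec::new();
    let mut i = from;
    while i < to {
        let starts_here =
            input[i].is_ascii_lowercase() && (i == from || is_boundary(input[i - 1]));
        if !starts_here {
            i += 1;
            continue;
        }
        // Consume the rest of the run.
        let mut end = i;
        while end + 1 < to
            && (input[end + 1].is_ascii_lowercase()
                || input[end + 1].is_ascii_digit()
                || input[end + 1] == b'-')
        {
            end += 1;
        }
        // Only keep the run when it is followed by a boundary.
        if end + 1 == to || is_boundary(input[end + 1]) {
            spans.push(Span { start: i, end });
        }
        i = end + 1;
    }
    spans
}

/// Re-scan a slice the machines gave up on: restart extraction right after
/// every `[` and collect whatever the restarted runs find.
fn rescan_after_brackets(input: &[u8], from: usize, to: usize) -> Vec<Span> {
    let mut spans: Vec<Span> = Vec::new();
    for pos in from..to {
        if input[pos] == b'[' {
            for span in toy_extract(input, pos + 1, to) {
                // Nested `[` characters re-find the same spans; keep one copy.
                if !spans.contains(&span) {
                    spans.push(span);
                }
            }
        }
    }
    spans
}
```

Calling `rescan_after_brackets` on the bytes of `[CssClass("gap-y-6")]` with `from = 0` and `to = input.len()` yields a single span covering `gap-y-6`, because the quotes on both sides count as boundaries in this toy setup.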
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility).

In this case we will make sure to drop spans that are covered by other spans.

The extracted `Span`s are valid candidates, so if the outermost candidate is valid, we can throw away the inner candidate.

```
Position:                      11112222222
                               67890123456
                               ↓↓↓↓↓↓↓↓↓↓↓
Span { start: 17, end: 25 } //  color:red
Span { start: 16, end: 26 } // [color:red]
```

#### Exceptions

**JavaScript keys as candidates:**

We already talked about the `clsx` scenario, but there are a few more exceptions and they have to do with different syntaxes.

**CSS class shorthand in certain templating languages:**

In Pug and Slim, you can have a syntax like this:

```pug
.flex.underline
  div Hello World
```

<details>
<summary>Generated HTML</summary>

```html
<div class="flex underline">
  <div>Hello World</div>
</div>
```

</details>

We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input shorter or longer, otherwise the positions might be off.

In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that.

If you want to use `px-2.5` with this syntax, then you'd write:

```pug
.flex.px-2.5
  div Hello World
```

But that's invalid because that technically means `flex`, `px-2`, and `5` as classes.

You can use this syntax to get around that:

```pug
div(class="px-2.5")
  div Hello World
```

<details>
<summary>Generated HTML</summary>

```html
<div class="px-2.5">
  <div>Hello World</div>
</div>
```

</details>

Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that) and ignore replacing `.` inside of strings.

**Ruby's weird string syntax:**

```ruby
%w[flex underline]
```

This is valid syntax and is shorthand for:

```ruby
["flex", "underline"]
```

Luckily this problem is solved by running the sub-machines after each `[` character.

### Performance

**Testing:**

Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command:

```sh
cargo test test_variant_machine_performance --release -- --ignored
```

This will run the test in release mode and allows you to run the ignored test.

> [!CAUTION]
> This test **_will_** fail, but it will print some output. E.g.:

```
tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s
tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns
```

**Readability:**

One thing to note when looking at the code is that it's not always written in the cleanest way but we had to make some sacrifices for performance reasons.

The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`.

A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit.
This can be written as:

```rs
match (cursor.prev, cursor.curr, cursor.next) {
  (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ }
  _ => { /* … */ }
}
```

But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`.

To solve this we use some nesting, once we reach `b'.'` only then will we check for the previous and next characters. We will also early return in most places. If the previous character is not a digit, there is no need to check the next character.

**Classification and jump tables:**

Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes.

E.g.:

```rs
#[derive(Debug, Clone, Copy, PartialEq)]
enum Class {
    /// ', ", or `
    Quote,

    /// \
    Escape,

    /// Whitespace characters
    Whitespace,

    Other,
}

const CLASS_TABLE: [Class; 256] = {
    let mut table = [Class::Other; 256];

    macro_rules! set {
        ($class:expr, $($byte:expr),+ $(,)?) => {
            $(table[$byte as usize] = $class;)+
        };
    }

    set!(Class::Quote, b'"', b'\'', b'`');
    set!(Class::Escape, b'\\');
    set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C');

    table
};
```

There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values.

**Inlining**:

Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding a `#[inline(always)]` improves performance, sometimes it doesn't improve it at all.

You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added.

### Test Plan

1. Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates. So keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outermost `extractor/mod.rs`.
2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates.
3. You can test each machine's performance if you want to.

There is a chance that this new parser is missing candidates even though a lot of tests are added and existing tests have been ported.

To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious.

#### Tailwind UI

On Tailwind UI the diff looks like this:

<details>
<summary>diff</summary>

```diff
diff --git a/./main.css b/./pr.css
index d83b0a506..b3dd94a1d 100644
--- a/./main.css
+++ b/./pr.css
@@ -5576,9 +5576,6 @@ @layer utilities {
     --tw-saturate: saturate(0%);
     filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,);
   }
-  .\!filter {
-    filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important;
-  }
   .filter {
     filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,);
   }
```

</details>

The reason `!filter` is gone is that it was used like this:

```js
getProducts.js
23:  if (!filter) return true
```

And right now `(` and `)` are not considered valid boundary characters for a candidate.
#### Catalyst

On Catalyst, the diff looks like this:

<details>
<summary>diff</summary>

```diff
diff --git a/./main.css b/./pr.css
index 9f8ed129..4aec992e 100644
--- a/./main.css
+++ b/./pr.css
@@ -2105,9 +2105,6 @@
   .outline-transparent {
     outline-color: transparent;
   }
-  .filter {
-    filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,);
-  }
   .backdrop-blur-\[6px\] {
     --tw-backdrop-blur: blur(6px);
     -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,);
@@ -7141,46 +7138,6 @@
   inherits: false;
   initial-value: solid;
 }
-@property --tw-blur {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-brightness {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-contrast {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-grayscale {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-hue-rotate {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-invert {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-opacity {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-saturate {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-sepia {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-drop-shadow {
-  syntax: "*";
-  inherits: false;
-}
 @property --tw-backdrop-blur {
   syntax: "*";
   inherits: false;
```

</details>

The reason for this is that `filter` was only used as a function call:

```tsx
src/app/docs/Code.tsx
31:    .filter((x) => x !== null)
```

This was tested on all templates and they all remove a very small number of classes that aren't used.

The script to test this looks like this:

```sh
bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css
bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css
git diff --no-index --patch ./{main,pr}.css
```

This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder.

---

### Fixes:

- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property)

---

### Ideas for in the future

1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object.
2. If you take a look at the AST, you'll notice that utilities and variants have a "root", these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class.
3. Pass through the `prefix` information. Everything that doesn't start with `tw:` can be skipped.

### Design decisions that didn't make it

Once you reach this part, you can stop reading if you want to, but this is more like a brain dump of the things we tried and didn't work out. We wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are.
#### One character at a time

In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to:

1. Ask the `CandidateMachine` which state it was in.
2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character.
3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on.
4. Once we did all of that we could go to the next character.

In this approach, the `MachineState` looked like this instead:

```rs
enum MachineState {
  Idle,
  Parsing,
  Done(Span)
}
```

This had its own set of problems because now it's very hard to know whether we are done or not.

```html
<div class="hover:flex"></div>
<!-- ^ -->
```

Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not.

#### `Span` stitching

Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`.

The next thing we had to do was to make sure that:

1. Covered spans were removed. We still do this part in the current implementation.
2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`).
3. For every combined variant span, find a corresponding utility span.
   - If there is no utility span, the candidate is invalid.
   - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans)
   - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block`
4. All left-over utility spans are candidates without variants.
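Step 1 of this list, removing covered spans, is the part that survived into the current implementation, and it can be sketched in Rust as follows. This is a minimal illustration rather than the PR's actual code: `drop_covered_spans` is a hypothetical name, and the `Span` here treats `end` as inclusive, which matches the positions shown in the examples earlier.

```rust
/// Inclusive byte range, mirroring the `Span` struct shown earlier.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical helper: drops every span that is fully covered by another
/// span (and collapses exact duplicates). Partially overlapping spans stay.
fn drop_covered_spans(mut spans: Vec<Span>) -> Vec<Span> {
    // Sort by start ascending and, for equal starts, by end descending, so a
    // covering span always comes before the spans it covers.
    spans.sort_by(|a, b| a.start.cmp(&b.start).then(b.end.cmp(&a.end)));
    spans.dedup();

    let mut kept: Vec<Span> = Vec::new();
    // Largest `end` among the spans kept so far.
    let mut max_end: Option<usize> = None;

    for span in spans {
        match max_end {
            // An earlier span starts at or before this one and ends at or
            // after it, so this span is covered and can be dropped.
            Some(end) if span.end <= end => {}
            _ => {
                kept.push(span);
                max_end = Some(span.end);
            }
        }
    }

    kept
}
```

With the spans from the `[color:red]` example, `Span { start: 17, end: 25 }` is dropped and only `Span { start: 16, end: 26 }` remains. Steps 2 to 4 of the span stitching approach are not sketched here.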
The span stitching approach was slow, and still a bit hard to reason about.

#### Matching on tuples

While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about, it was not very fast. Unfortunately we had to abandon this approach in favor of a more optimized approach.

In a perfect world, we would still write it this way, but have some compile time macro that would optimize this for us.

#### Matching against `b'…'` instead of classification and jump tables

Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster.

Luckily for us, each machine has its own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine.

[^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme.

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-05 11:55:24 +01:00
version = 4
[[package]]
name = "aho-corasick"
version = "1.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8e60d3430d3a69478ad0993f19238d2df97c507009a52b3c10addcd7f6bcb916"
dependencies = [
"memchr",
]
[[package]]
name = "arrayvec"
version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7c02d123df017efcdfbd739ef81735b36c5ba83ec3c59c80a9d7ecc718f92e50"
Auto source detection improvements (#14820)

This PR introduces a new `source(…)` argument and improves on the existing `@source`. The goal of this PR is to make the automatic source detection configurable, let's dig in.

By default, we will perform automatic source detection starting at the current working directory. Auto source detection will find plain text files (no binaries, images, ...) and will ignore git-ignored files.

If you want to start from a different directory, you can use the new `source(…)` next to the `@import "tailwindcss/utilities" layer(utilities) source(…)`.

E.g.:

```css
/* ./src/styles/index.css */
@import 'tailwindcss/utilities' layer(utilities) source('../../');
```

Most people won't split their source files, and will just use the simple `@import "tailwindcss";`. For this reason, you can use `source(…)` on the import as well.

E.g.:

```css
/* ./src/styles/index.css */
@import 'tailwindcss' source('../../');
```

Sometimes, you want to rely on auto source detection, but also want to look in another directory for source files. In this case, you can use the `@source` directive:

```css
/* ./src/index.css */
@import 'tailwindcss';

/* Look for `blade.php` files in `../resources/views` */
@source '../resources/views/**/*.blade.php';
```

However, you don't need to specify the extension. Instead, you can just point to the directory and all the same automatic source detection rules will apply.

```css
/* ./src/index.css */
@import 'tailwindcss';

@source '../resources/views';
```

If, for whatever reason, you want to disable the default source detection feature entirely, and only want to rely on very specific glob patterns you define, then you can disable it via `source(none)`.

```css
/* Completely disable the default auto source detection */
@import 'tailwindcss' source(none);

/* Only look at .blade.php files, nothing else */
@source "../resources/views/**/*.blade.php";
```

Note: even with `source(none)`, if your `@source` points to a directory, then auto source detection will still be performed in that directory. If you don't want that, then you can simply add explicit files in the globs as seen in the previous example.

```css
/* Completely disable the default auto source detection */
@import 'tailwindcss' source(none);

/* Run auto source detection in `../resources/views` */
@source "../resources/views";
```

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Adam Wathan <4323180+adamwathan@users.noreply.github.com>
2024-10-29 21:33:34 +01:00
[[package]]
name = "bexpand"
version = "1.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "045d7d9db8390cf2c59f39f3bd138f1962ef616b096d1b9f5651c7acba19e5a7"
dependencies = [
"itertools",
"nom",
]
2024-03-05 14:23:26 +01:00
[[package]]
name = "bitflags"
Add `@source` support (#14078)

This PR is an umbrella PR where we will add support for the new `@source` directive. This will allow you to add explicit content glob patterns if you want to look for Tailwind classes in other files that are not automatically detected yet.

Right now this is an addition to the existing auto content detection that is automatically enabled in the `@tailwindcss/postcss` and `@tailwindcss/cli` packages. The `@tailwindcss/vite` package doesn't use the auto content detection, but uses the module graph instead.

From an API perspective there is not a lot going on. There are only a few things that you have to know when using the `@source` directive, and you probably already know the rules:

1. You can use multiple `@source` directives if you want.
2. The `@source` directive accepts a glob pattern so that you can match multiple files at once.
3. The pattern is relative to the current file you are in.
4. The pattern includes all files it is matching, even git-ignored files.
   1. The motivation for this is so that you can explicitly point to a `node_modules` folder if you want to look at `node_modules` for whatever reason.
5. Right now we don't support negative globs (starting with a `!`) yet; that will be available in the near future.

Usage example:

```css
/* ./src/input.css */
@import "tailwindcss";
@source "../laravel/resources/views/**/*.blade.php";
@source "../../packages/monorepo-package/**/*.js";
```

It looks like the PR introduced a lot of changes, but this is a side effect of all the other plumbing work we had to do to make this work. For example:

1. We added dedicated integration tests that run on Linux and Windows in CI (just to make sure that all the `path` logic is correct).
2. We have to make sure that the glob patterns are always correct even if you are using `@import` in your CSS and use `@source` in an imported file. This is because we receive the flattened CSS contents where all `@import`s are inlined.
3. We have to make sure that we also listen for changes in the files that match any of these patterns and trigger a rebuild.

PRs:

- [x] https://github.com/tailwindlabs/tailwindcss/pull/14063
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14085
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14079
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14067
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14076
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14080
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14127
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14135

Once all the PRs are merged, then this umbrella PR can be merged.

> [!IMPORTANT]
> Make sure to merge this without rebasing such that each individual PR ends up on the main branch.

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>
Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Adam Wathan <adam.wathan@gmail.com>
2024-08-07 16:38:44 +02:00
version = "2.6.0"
2024-03-05 14:23:26 +01:00
source = "registry+https://github.com/rust-lang/crates.io-index"
2024-08-07 16:38:44 +02:00
checksum = "b048fb63fd8b5923fc5aa7b340d8e156aec7ec02f0c78fa8a6ddc2613f6f71de"
2024-03-05 14:23:26 +01:00
[[package]]
name = "bstr"
Improve Oxide candidate extractor [0] (#16306) This PR adds a new candidate[^candidate] extractor with 2 major goals in mind: 1. It must be way easier to reason about and maintain. 2. It must have on-par performance or better than the current candidate extractor. ### Problem Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good. One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that. The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is. In a perfect world we limit the characters that can be used and defined a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions). The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in). While there is definitely some structure, we essentially work in 2 phases: 1. Try to extract `0..n` candidates. (This is the hard part) 2. 
Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have) Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability. Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not. (we have some ideas to limit the amount of false positives but that's for another time) ### Solution Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this: ```html <div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div> ``` <details> <summary>This will be parsed into the following AST:</summary> ```json [ { "kind": "functional", "root": "text", "value": { "kind": "named", "value": "red-500", "fraction": null }, "modifier": { "kind": "arbitrary", "value": "var(--my-opacity)" }, "variants": [ { "kind": "static", "root": "hover" }, { "kind": "functional", "root": "data", "value": { "kind": "arbitrary", "value": "state=pending" }, "modifier": null }, { "kind": "arbitrary", "selector": "@media(pointer:fine)", "relative": false } ], "important": false, "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)" } ] ``` </details> We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ... 
Some of these patterns will be important for the new candidate extractor as well:

| Name                       | Example           | Description                                                                                         |
| -------------------------- | ----------------- | --------------------------------------------------------------------------------------------------- |
| Static utility (named)     | `flex`            | A simple utility with no inputs whatsoever                                                          |
| Functional utility (named) | `bg-red-500`      | A utility `bg` with an input that is named `red-500`                                                |
| Arbitrary value            | `bg-[#0088cc]`    | A utility `bg` with an input that is arbitrary, denoted by `[…]`                                    |
| Arbitrary variable         | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` |
| Arbitrary property         | `[color:red]`     | A utility that sets a property to a value on the fly                                                |

A similar structure exists for modifiers, where each modifier must start with `/`:

| Name               | Example                     | Description                              |
| ------------------ | --------------------------- | ---------------------------------------- |
| Named modifier     | bg-red-500`/20`             | A named modifier                         |
| Arbitrary value    | bg-red-500`/[20%]`          | An arbitrary value, denoted by `/[…]`    |
| Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` |

Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`.
| Name               | Example                     | Description                                                              |
| ------------------ | --------------------------- | ------------------------------------------------------------------------ |
| Named variant      | `hover:`                    | A named variant                                                          |
| Arbitrary value    | `data-[state=pending]:`     | An arbitrary value, denoted by `[…]`                                     |
| Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)`                                  |
| Arbitrary variant  | `[@media(pointer:fine)]:`   | Similar to arbitrary properties, this will generate a variant on the fly |

The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress).

This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines.

One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early.

At the time of writing this, there are a bunch of machines:

<details>
<summary>Overview of the machines</summary>

- `ArbitraryPropertyMachine`

  Extracts candidates such as `[color:red]`. Some of the rules are:

  1. There must be a property name
  2. There must be a `:`
  3. There must be a value

  There cannot be any spaces, the brackets are included, and if the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`).

  ```
  [color:red]
  ^^^^^^^^^^^

  [--my-color:red]
  ^^^^^^^^^^^^^^^^
  ```

  Depends on the `StringMachine` and `CssVariableMachine`.

- `ArbitraryValueMachine`

  Extracts arbitrary values for utilities and modifiers including the brackets:

  ```
  bg-[#0088cc]
     ^^^^^^^^^

  bg-red-500/[20%]
             ^^^^^
  ```

  Depends on the `StringMachine`.

- `ArbitraryVariableMachine`

  Extracts arbitrary variables including the parentheses.
The first argument must be a valid CSS variable, the other arguments are optional fallback arguments. ``` (--my-value) ^^^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `CandidateMachine` Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility. ``` hover:focus:flex ^^^^^^^^^^^^^^^^ aria-invalid:bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `VariantMachine` and `UtilityMachine`. - `CssVariableMachine` Extracts CSS variables, they must start with `--` and must contain at least one alphanumeric character or, `-`, `_` and can contain any escaped character (except for whitespace). ``` bg-(--my-color) ^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^ bg-(--my-color)/(--my-opacity) ^^^^^^^^^^ ^^^^^^^^^^^^ ``` - `ModifierMachine` Extracts modifiers including the `/` - `/[` will delegate to the `ArbitraryValueMachine` - `/(` will delegate to the `ArbitraryVariableMachine` ``` bg-red-500/20 ^^^ bg-red-500/[20%] ^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedUtilityMachine` Extracts named utilities regardless of whether they are functional or static. ``` flex ^^^^ px-2.5 ^^^^^^ ``` This includes rules like: A `.` must be surrounded by digits. Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedVariantMachine` Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about. Another rule is that the `:` must be included. 
``` hover:flex ^^^^^^ data-[state=pending]:flex ^^^^^^^^^^^^^^^^^^^^^ supports-(--my-variable):flex ^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`. - `StringMachine` This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks. We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines. ``` content-["Hello_World!"] ^^^^^^^^^^^^^^ bg-[url("https://example.com")] ^^^^^^^^^^^^^^^^^^^^^ ``` - `UtilityMachine` Extracts utilities, it will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility. It will also handle important markers (including the legacy important marker). ``` flex ^^^^ bg-red-500/20 ^^^^^^^^^^^^^ !bg-red-500/20 Legacy important marker ^^^^^^^^^^^^^^ bg-red-500/20! New important marker ^^^^^^^^^^^^^^ !bg-red-500/20! Both, but this is considered invalid ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`. - `VariantMachine` Extracts variants, it will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant. ``` hover:focus:flex ^^^^^^ ^^^^^^ ``` Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`. </details> One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method and returns a `MachineState`. The `MachineState` looks like this: ```rs enum MachineState { Idle, Done(Span) } ``` Where a `Span` is just the location in the input where the candidate was found. 
```rs struct Span { pub start: usize, pub end: usize, } ``` #### Complexities **Boundary characters:** When running these machines to completion, they don't typically check for boundary characters, the wrapping `CandidateMachine` will check for boundary characters. A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate. ```html <div class="flex"></div> <!-- ^ ^ --> ``` The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate. **What to pick?** Let's imagine you are parsing this input: ```html <div class="hover:flex"></div> ``` The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this: ```rs let variant_machine_state = variant_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 17 }) // `hover:` let utility_machine_state = utility_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 16 }) // `hover` ``` They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility. Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.: ```tsx <div class={clsx({ underline: someCondition(), })} ></div> ``` In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid, is because there wasn't a variant present prior to this candidate. 
E.g.:

```tsx
<div
  class={clsx({
    hover:underline: someCondition(),
  })}
></div>
```

This will be considered invalid. If you do want this, you should use quotes. E.g.:

```tsx
<div
  class={clsx({
    'hover:underline': someCondition(),
  })}
></div>
```

**Overlapping/covered spans:**

Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example:

```csharp
public enum StackSpacing
{
    [CssClass("gap-y-4")]
    Small,

    [CssClass("gap-y-6")]
    Medium,

    [CssClass("gap-y-8")]
    Large
}
```

In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here:

1. It is an arbitrary property, e.g.: `[color:red]`
2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:`

When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state.

- It is not a valid variant because it didn't end with a `:`.
- It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value.

Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines' perspective.

Obviously we want to extract the `gap-y-*` classes here.

To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing.

That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, we will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class.

The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`)? Then the `VariantMachine` would complete, but the `UtilityMachine` would not because nothing exists after it. We will apply the same idea in this case.
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility). In this case we will make sure to drop spans that are covered by other spans. The extracted `Span`s will be valid candidates therefore if the outer most candidate is valid, we can throw away the inner candidate. ``` Position: 11112222222 67890123456 ↓↓↓↓↓↓↓↓↓↓↓ Span { start: 17, end: 25 } // color:red Span { start: 16, end: 26 } // [color:red] ``` #### Exceptions **JavaScript keys as candidates:** We already talked about the `clsx` scenario, but there are a few more exceptions and that has to do with different syntaxes. **CSS class shorthand in certain templating languages:** In Pug and Slim, you can have a syntax like this: ```pug .flex.underline div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="flex underline"> <div>Hello World</div> </div> ``` </details> We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input smaller or longer otherwise the positions might be off. In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that. If you want to use `px-2.5` with this syntax, then you'd write: ```pug .flex.px-2.5 div Hello World ``` But that's invalid because that technically means `flex`, `px-2`, and `5` as classes. 
You can use this syntax to get around that: ```pug div(class="px-2.5") div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="px-2.5"> <div>Hello World</div> </div> ``` </details> Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that) and ignore replacing `.` inside of strings. **Ruby's weird string syntax:** ```ruby %w[flex underline] ``` This is valid syntax and is shorthand for: ```ruby ["flex", "underline"] ``` Luckily this problem is solved by the running the sub-machines after each `[` character. ### Performance **Testing:** Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command: ```sh cargo test test_variant_machine_performance --release -- --ignored ``` This will run the test in release mode and allows you to run the ignored test. > [!CAUTION] > This test **_will_** fail, but it will print some output. E.g.: ``` tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns ``` **Readability:** One thing to note when looking at the code is that it's not always written in the cleanest way but we had to make some sacrifices for performance reasons. The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`. A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit. 
This can be written as: ```rs match (cursor.prev, cursor.curr, cursor.next) { (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ } _ => { /* … */ } } ``` But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`. To solve this we use some nesting, once we reach `b'.'` only then will we check for the previous and next characters. We will also early return in most places. If the previous character is not a digit, there is no need to check the next character. **Classification and jump tables:** Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes. E.g.: ```rs #[derive(Debug, Clone, Copy, PartialEq)] enum Class { /// ', ", or ` Quote, /// \ Escape, /// Whitespace characters Whitespace, Other, } const CLASS_TABLE: [Class; 256] = { let mut table = [Class::Other; 256]; macro_rules! set { ($class:expr, $($byte:expr),+ $(,)?) => { $(table[$byte as usize] = $class;)+ }; } set!(Class::Quote, b'"', b'\'', b'`'); set!(Class::Escape, b'\\'); set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C'); table }; ``` There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values. **Inlining**: Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding a `#[inline(always)]` improves performance, sometimes it doesn't improve it at all. You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added. ### Test Plan 1. 
Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates. So keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outer most `extractor/mod.rs`. 2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates. 3. You can test each machine's performance if you want to. There is a chance that this new parser is missing candidates even though a lot of tests are added and existing tests have been ported. To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious. #### Tailwind UI On Tailwind UI the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index d83b0a506..b3dd94a1d 100644 --- a/./main.css +++ b/./pr.css @@ -5576,9 +5576,6 @@ @layer utilities { --tw-saturate: saturate(0%); filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } - .\!filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important; - } .filter { filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } ``` </details> The reason `!filter` is gone, is because it was used like this: ```js getProducts.js 23: if (!filter) return true ``` And right now `(` and `)` are not considered valid boundary characters for a candidate. 
#### Catalyst On Catalyst, the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index 9f8ed129..4aec992e 100644 --- a/./main.css +++ b/./pr.css @@ -2105,9 +2105,6 @@ .outline-transparent { outline-color: transparent; } - .filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); - } .backdrop-blur-\[6px\] { --tw-backdrop-blur: blur(6px); -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,); @@ -7141,46 +7138,6 @@ inherits: false; initial-value: solid; } -@property --tw-blur { - syntax: "*"; - inherits: false; -} -@property --tw-brightness { - syntax: "*"; - inherits: false; -} -@property --tw-contrast { - syntax: "*"; - inherits: false; -} -@property --tw-grayscale { - syntax: "*"; - inherits: false; -} -@property --tw-hue-rotate { - syntax: "*"; - inherits: false; -} -@property --tw-invert { - syntax: "*"; - inherits: false; -} -@property --tw-opacity { - syntax: "*"; - inherits: false; -} -@property --tw-saturate { - syntax: "*"; - inherits: false; -} -@property --tw-sepia { - syntax: "*"; - inherits: false; -} -@property --tw-drop-shadow { - syntax: "*"; - inherits: false; -} @property --tw-backdrop-blur { syntax: "*"; inherits: false; ``` </details> The reason for this is that `filter` was only used as a function call: ```tsx src/app/docs/Code.tsx 31: .filter((x) => x !== null) ``` This was tested on all templates and they all remove a very small amount of classes that aren't used. 
The script to test this looks like this: ```sh bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css git diff --no-index --patch ./{main,pr}.css ``` This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder. --- ### Fixes: - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property) --- ### Ideas for in the future 1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object. 2. If you take a look at the AST, you'll notice that utilities and variants have a "root", these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class. 3. Passthrough the `prefix` information. Everything that doesn't start with `tw:` can be skipped. ### Design decisions that didn't make it Once you reach this part, you can stop reading if you want to, but this is more like a brain dump of the things we tried and didn't work out. Wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are. 
#### One character at a time In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to: 1. Ask the `CandidateMachine` which state it was in. 2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character. 3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on. 4. Once we did all of that we could go to the next character. In this approach, the `MachineState` looked like this instead: ```rs enum MachineState { Idle, Parsing, Done(Span) } ``` This had its own set of problems because now it's very hard to know whether we are done or not. ```html <div class="hover:flex"></div> <!-- ^ --> ``` Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not. #### `Span` stitching Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`. The next thing we had to do was to make sure that: 1. Covered spans were removed. We still do this part in the current implementation. 2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`). 3. For every combined variant span, find a corresponding utility span. - If there is no utility span, the candidate is invalid. - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans) - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block` 4. All left-over utility spans are candidates without variants. 
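Steps 1 and 2 of the list above can be sketched as follows. This is only an illustrative sketch, not the code from the PR: the helper names `drop_covered_spans` and `combine_touching` are hypothetical, `Span` mirrors the struct shown earlier, `end` is assumed to be inclusive (as in the `hover:` example with `start: 12, end: 17`), and `combine_touching` assumes its input is already sorted by `start`.

```rust
/// Location of a candidate in the input (same shape as the `Span` shown earlier).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize, // assumed inclusive
}

/// Hypothetical helper for step 1: keep only spans that are not fully
/// covered by another span.
fn drop_covered_spans(mut spans: Vec<Span>) -> Vec<Span> {
    // Sort by start ascending; for equal starts, put the longest span first.
    spans.sort_by(|a, b| a.start.cmp(&b.start).then(b.end.cmp(&a.end)));

    let mut kept: Vec<Span> = Vec::new();
    for span in spans {
        // After sorting, the last kept span starts at or before `span` and
        // has the largest end seen so far, so `span` is covered exactly
        // when it ends at or before that end.
        let covered = kept.last().map_or(false, |last| span.end <= last.end);
        if !covered {
            kept.push(span);
        }
    }
    kept
}

/// Hypothetical helper for step 2: merge runs of touching spans
/// (where `span_a.end + 1 == span_b.start`) into a single span.
/// Assumes `spans` is sorted by `start`.
fn combine_touching(spans: &[Span]) -> Vec<Span> {
    let mut combined: Vec<Span> = Vec::new();
    for &span in spans {
        if let Some(last) = combined.last_mut() {
            if last.end + 1 == span.start {
                // Extend the previous span instead of starting a new one.
                last.end = span.end;
                continue;
            }
        }
        combined.push(span);
    }
    combined
}
```

With the `[color:red]` example from earlier, `drop_covered_spans` would keep only the outer span (`start: 16, end: 26`) and discard the inner `color:red` span.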
This approach was slow, and still a bit hard to reason about.

#### Matching on tuples

While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about, it was not very fast. Unfortunately we had to abandon this approach in favor of a more optimized one. In a perfect world, we would still write it this way, but have some compile-time macro that would optimize this for us.

#### Matching against `b'…'` instead of classification and jump tables

Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster.

Luckily for us, each machine has its own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine.

[^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something, but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme.

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-05 11:55:24 +01:00
version = "1.11.3"
2024-03-05 14:23:26 +01:00
source = "registry+https://github.com/rust-lang/crates.io-index"
2025-03-05 11:55:24 +01:00
checksum = "531a9155a481e2ee699d4f98f43c0ca4ff8ee1bfd55c31e9e98fb29d2b176fe0"
2024-03-05 14:23:26 +01:00
dependencies = [
"memchr",
"regex-automata 0.4.8",
2024-03-05 14:23:26 +01:00
"serde",
]
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
Improve internal DX around byte classification [1] (#16864)

This PR improves the internal DX when working with `u8` classification into a smaller enum. This is done by implementing a `ClassifyBytes` proc derive macro. The benefit of this is that the DX is much better and everything you will see here is done at compile time.

Before:

```rs
#[derive(Debug, Clone, Copy, PartialEq)]
enum Class {
    ValidStart,
    ValidInside,
    OpenBracket,
    OpenParen,
    Slash,
    Other,
}

const CLASS_TABLE: [Class; 256] = {
    let mut table = [Class::Other; 256];

    macro_rules! set {
        ($class:expr, $($byte:expr),+ $(,)?) => {
            $(table[$byte as usize] = $class;)+
        };
    }

    macro_rules! set_range {
        ($class:expr, $start:literal ..= $end:literal) => {
            let mut i = $start;
            while i <= $end {
                table[i as usize] = $class;
                i += 1;
            }
        };
    }

    set_range!(Class::ValidStart, b'a'..=b'z');
    set_range!(Class::ValidStart, b'A'..=b'Z');
    set_range!(Class::ValidStart, b'0'..=b'9');

    set!(Class::OpenBracket, b'[');
    set!(Class::OpenParen, b'(');
    set!(Class::Slash, b'/');
    set!(Class::ValidInside, b'-', b'_', b'.');

    table
};
```

After:

```rs
#[derive(Debug, Clone, Copy, PartialEq, ClassifyBytes)]
enum Class {
    #[bytes_range(b'a'..=b'z', b'A'..=b'Z', b'0'..=b'9')]
    ValidStart,

    #[bytes(b'-', b'_', b'.')]
    ValidInside,

    #[bytes(b'[')]
    OpenBracket,

    #[bytes(b'(')]
    OpenParen,

    #[bytes(b'/')]
    Slash,

    #[fallback]
    Other,
}
```

Before, we were generating a `CLASS_TABLE` that we could access directly, but now it will be part of the `Class`. This means that the usage has to change:

```diff
- CLASS_TABLE[cursor.curr as usize]
+ Class::TABLE[cursor.curr as usize]
```

This is slightly worse UX, and this is where another change comes in. We implemented the `From<u8> for #enum_name` trait inside of the `ClassifyBytes` derive macro. This allows us to use `.into()` on any `u8` as long as we are comparing it to a `Class` instance.
In our scenario: ```diff - Class::TABLE[cursor.curr as usize] + cursor.curr.into() ``` Usage wise, this looks something like this: ```diff while cursor.pos < len { - match Class::TABLE[cursor.curr as usize] { + match cursor.curr.into() { - Class::Escape => match Class::Table[cursor.next as usize] { + Class::Escape => match cursor.next.into() { // An escaped whitespace character is not allowed Class::Whitespace => return MachineState::Idle, // An escaped character, skip ahead to the next character _ => cursor.advance(), }, // End of the string Class::Quote if cursor.curr == end_char => return self.done(start_pos, cursor), // Any kind of whitespace is not allowed Class::Whitespace => return MachineState::Idle, // Everything else is valid _ => {} }; cursor.advance() } MachineState::Idle } } ``` If you manually look at the `Class::TABLE` in your editor for example, you can see that it is properly generated at compile time. Given this input: ```rs #[derive(Clone, Copy, ClassifyBytes)] enum Class { #[bytes_range(b'a'..=b'z')] AlphaLower, #[bytes_range(b'A'..=b'Z')] AlphaUpper, #[bytes(b'@')] At, #[bytes(b':')] Colon, #[bytes(b'-')] Dash, #[bytes(b'.')] Dot, #[bytes(b'\0')] End, #[bytes(b'!')] Exclamation, #[bytes_range(b'0'..=b'9')] Number, #[bytes(b'[')] OpenBracket, #[bytes(b']')] CloseBracket, #[bytes(b'(')] OpenParen, #[bytes(b'%')] Percent, #[bytes(b'"', b'\'', b'`')] Quote, #[bytes(b'/')] Slash, #[bytes(b'_')] Underscore, #[bytes(b' ', b'\t', b'\n', b'\r', b'\x0C')] Whitespace, #[fallback] Other, } ``` This is the result: <img width="1244" alt="image" src="https://github.com/user-attachments/assets/6ffd6ad3-0b2f-4381-a24c-593e4c72080e" />
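To make the compile-time table and the `From<u8>` conversion more concrete, here is a hand-written sketch of roughly what a derive like `ClassifyBytes` could expand to for a reduced enum. The actual macro output is not shown in the PR description, so the table construction below and the `classify` helper are assumptions for illustration only.

```rust
// Hand-written approximation of a `ClassifyBytes` expansion (assumption:
// the real generated code may be structured differently).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Class {
    ValidStart,
    ValidInside,
    Slash,
    Other,
}

impl Class {
    /// 256-entry lookup table, evaluated entirely at compile time.
    const TABLE: [Class; 256] = {
        // Start with the fallback variant for every byte.
        let mut table = [Class::Other; 256];
        let mut i = 0usize;
        while i < 256 {
            table[i] = match i as u8 {
                b'a'..=b'z' | b'A'..=b'Z' | b'0'..=b'9' => Class::ValidStart,
                b'-' | b'_' | b'.' => Class::ValidInside,
                b'/' => Class::Slash,
                _ => Class::Other,
            };
            i += 1;
        }
        table
    };
}

// This impl is what makes `byte.into()` work wherever a `Class` is expected.
impl From<u8> for Class {
    fn from(byte: u8) -> Self {
        Class::TABLE[byte as usize]
    }
}

/// Hypothetical helper (not part of the PR): classify every byte of a slice.
fn classify(input: &[u8]) -> Vec<Class> {
    input.iter().map(|&byte| byte.into()).collect()
}
```

With this in place, `let class: Class = b'/'.into();` resolves through the table lookup, and a `match cursor.curr.into() { Class::Slash => …, … }` style of usage type-checks because the match arms pin the target type to `Class`.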
2025-03-05 14:00:07 +01:00
[[package]]
name = "classification-macros"
version = "0.1.0"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "convert_case"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ec182b0ca2f35d8fc196cf3404988fd8b8c739a4d270ff118a398feb0cbec1ca"
dependencies = [
"unicode-segmentation",
]
[[package]]
name = "crossbeam"
version = "0.8.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1137cd7e7fc0fb5d3c5a8678be38ec56e819125d8d7907411fe24ccb943faca8"
dependencies = [
"crossbeam-channel",
"crossbeam-deque",
"crossbeam-epoch",
"crossbeam-queue",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-channel"
version = "0.5.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33480d6946193aa8033910124896ca395333cae7e2d1113d1fef6c3272217df2"
dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-deque"
version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "613f8cc01fe9cf1a3eb3d7f488fd2fa8388403e97039e2f73692932e291a770d"
dependencies = [
"crossbeam-epoch",
"crossbeam-utils",
]
[[package]]
name = "crossbeam-epoch"
version = "0.9.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5b82ac4a3c2ca9c3460964f020e1402edd5753411d7737aa39c3714ad1b5420e"
dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-queue"
version = "0.3.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df0346b5d5e76ac2fe4e327c5fd1118d6be7c51dfb18f9b7922923f287471e35"
dependencies = [
"crossbeam-utils",
]
[[package]]
name = "crossbeam-utils"
version = "0.8.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "22ec99545bb0ed0ea7bb9b8e1e9122ea386ff8a48c0922e43f36d45ab09e0e80"
[[package]]
name = "ctor"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "990a40740adf249724a6000c0fc4bd574712f50bb17c2d6f6cec837ae2f0ee75"
dependencies = [
"quote",
"syn",
]
[[package]]
name = "diff"
version = "0.1.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "56254986775e3233ffa9c4d7d3faaf6d36a2c09d30b20687e9f88bc8bafc16c8"
Add `@source` support (#14078)

This PR is an umbrella PR where we will add support for the new `@source` directive. This will allow you to add explicit content glob patterns if you want to look for Tailwind classes in other files that are not automatically detected yet.

Right now this is an addition to the existing auto content detection that is automatically enabled in the `@tailwindcss/postcss` and `@tailwindcss/cli` packages. The `@tailwindcss/vite` package doesn't use the auto content detection, but uses the module graph instead.

From an API perspective there is not a lot going on. There are only a few things that you have to know when using the `@source` directive, and you probably already know the rules:

1. You can use multiple `@source` directives if you want.
2. The `@source` directive accepts a glob pattern so that you can match multiple files at once.
3. The pattern is relative to the current file you are in.
4. The pattern includes all files it is matching, even git-ignored files.
   1. The motivation for this is so that you can explicitly point to a `node_modules` folder if you want to look at `node_modules` for whatever reason.
5. Right now we don't support negative globs (starting with a `!`) yet; that will be available in the near future.

Usage example:

```css
/* ./src/input.css */
@import "tailwindcss";

@source "../laravel/resources/views/**/*.blade.php";
@source "../../packages/monorepo-package/**/*.js";
```

It looks like the PR introduced a lot of changes, but this is a side effect of all the other plumbing work we had to do to make this work. For example:

1. We added dedicated integration tests that run on Linux and Windows in CI (just to make sure that all the `path` logic is correct).
2. We have to make sure that the glob patterns are always correct even if you are using `@import` in your CSS and use `@source` in an imported file. This is because we receive the flattened CSS contents where all `@import`s are inlined.
3. We have to make sure that we also listen for changes in the files that match any of these patterns and trigger a rebuild.

PRs:

- [x] https://github.com/tailwindlabs/tailwindcss/pull/14063
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14085
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14079
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14067
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14076
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14080
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14127
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14135

Once all the PRs are merged, then this umbrella PR can be merged.

> [!IMPORTANT]
> Make sure to merge this without rebasing such that each individual PR ends up on the main branch.

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>
Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Adam Wathan <adam.wathan@gmail.com>
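The rule that an `@source` pattern is relative to the file it appears in can be sketched as below. This is only an illustration using the Rust standard library: the `resolve_source` and `is_glob` names are hypothetical, and splitting a pattern into a static base directory plus a remaining glob is an assumption about one reasonable approach, not a description of how Tailwind CSS implements it.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: does this path segment contain glob syntax?
fn is_glob(segment: &str) -> bool {
    segment.contains(|c: char| matches!(c, '*' | '?' | '[' | '{'))
}

/// Hypothetical helper: resolve an `@source` pattern against the CSS file
/// that contains it. Returns the static base directory and the remaining
/// glob (empty if the pattern has no glob segments).
fn resolve_source(css_file: &Path, pattern: &str) -> (PathBuf, String) {
    // Patterns are relative to the directory of the current CSS file.
    let mut base = css_file.parent().map(Path::to_path_buf).unwrap_or_default();

    let mut segments = pattern.split('/').peekable();
    while let Some(segment) = segments.peek().copied() {
        if is_glob(segment) {
            break; // Everything from here on belongs to the glob part.
        }
        match segment {
            "" | "." => {}
            ".." => {
                // Step up one directory if there is a named one to leave,
                // otherwise keep the `..` so the path stays relative.
                if base.file_name().is_some() {
                    base.pop();
                } else {
                    base.push("..");
                }
            }
            name => base.push(name),
        }
        segments.next();
    }

    let glob = segments.collect::<Vec<_>>().join("/");
    (base, glob)
}
```

For the usage example above, calling `resolve_source(Path::new("src/input.css"), "../laravel/resources/views/**/*.blade.php")` yields the base directory `laravel/resources/views` and the glob `**/*.blade.php`.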
2024-08-07 16:38:44 +02:00
[[package]]
name = "dunce"
version = "1.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "92773504d58c093f6de2459af4af33faa518c13451eb8f2b5698ed3d36e7c813"
[[package]]
name = "either"
version = "1.8.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7fcaabb2fef8c910e7f4c7ce9f67a1283a1715879a7c230ca9d6d1ae31f16d91"
[[package]]
name = "errno"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "534c5cf6194dfab3db3242765c03bbe257cf92f22b38f6bc0c58d59108a820ba"
dependencies = [
"libc",
"windows-sys 0.52.0",
]
[[package]]
name = "fast-glob"
version = "0.4.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0eca69ef247d19faa15ac0156968637440824e5ff22baa5ee0cd35b2f7ea6a0f"
dependencies = [
"arrayvec",
]
[[package]]
name = "fastrand"
version = "2.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e8c02a5121d4ea3eb16a80748c74f5549a5665e4c21333c6098f283870fbdea6"
[[package]]
name = "globset"
Add `@source not` support (#17255)

This PR adds a new source detection feature: `@source not "…"`. It can be used to exclude files specifically from your source configuration without having to think about creating a rule that matches all but the requested file:

```css
@import "tailwindcss";
@source not "../src/my-tailwind-js-plugin.js";
```

While working on this feature, we noticed that there are multiple places with different heuristics we used to scan the file system. These are:

- Auto source detection (so the default configuration or an `@source "./my-dir"`)
- Custom sources (e.g. `@source "./**/*.bin"`; these contain file extensions)
- The code to detect updates on the file system

Because of the different heuristics, we were able to construct failing cases (e.g. when you create a new file in `my-dir` that would be thrown out by auto-source detection, it would actually be scanned). We were also leaving a lot of performance on the table, as the file system is traversed multiple times for certain problems.

To resolve these issues, we're now unifying all of these systems into one `ignore` crate walker setup. We also implemented features like auto-source-detection and the `not` flag as additional _gitignore_ rules only, avoiding the need for a lot of custom decision-making code.

At a high level, this is what happens now:

- We collect all non-negative `@source` rules into a list of _roots_ (that is the source directory for this rule) and optional _globs_ (that is the actual rules for files in this file). For custom sources (i.e. with a custom `glob`), we add an allowlist rule to the gitignore setup, so that we can be sure these files are always included.
- For every negative `@source` rule, we create respective ignore rules.
- Furthermore, we have a custom filter that ensures files are only read if they have been changed since the last time they were read.

So, consider the following setup:

```css
/* packages/web/src/index.css */
@import "tailwindcss";
@source "../../lib/ui/**/*.bin";
@source not "../../lib/ui/expensive.bin";
```

This creates a git ignore file that (simplified) looks like this:

```gitignore
# Auto-source rules
*.{exe,node,bin,…}
*.{css,scss,sass,…}
{node_modules,git}/

# Custom sources can overwrite auto-source rules
!lib/ui/**/*.bin

# Negative rules
lib/ui/expensive.bin
```

We then use this information _on top of your existing `.gitignore` setup_ to resolve files (i.e. if your `.gitignore` contains a rule such as `dist/`, that line is going to be added _before_ any of the rules outlined in the example above). This allows negative rules to allow-list your `.gitignore` rules.

To implement this, we rely on the `ignore` crate, but we had to make various very specific changes to it, so we decided to fork the crate. All changes are prefixed with a `// CHANGED:` block, but here are the most important ones:

- We added a way to add custom ignore rules that _extend_ (rather than overwrite) your existing `.gitignore` rules.
- We updated the order in which files are resolved and made it so that more-specific files can allow-list more generic ignore rules.
- We resolved various issues related to adding more than one base path to the traversal and ensured it works consistently on Linux, macOS, and Windows.

## Behavioral changes

1. Any custom glob defined via `@source` now wins over your `.gitignore` file and the auto-content rules.
   - Resolves #16920
2. The `node_modules` and `.git` folders as well as the `.gitignore` file are now ignored by default (but can be overridden by an explicit `@source` rule).
   - Resolves #17318
   - Resolves #15882
3. Source paths into ignored-by-default folders (like `node_modules`) now also win over your `.gitignore` configuration and auto-content rules.
   - Resolves #16669
4. Introduced `@source not "…"` to negate any previous rules.
   - Resolves #17058
5. Negative `content` rules in your legacy JavaScript configuration (e.g. `content: ['!./src']`) now work with v4.
   - Resolves #15943
6. The order of `@source` definitions matters now, because you can technically include or negate previous rules. This is similar to your `.gitignore` file.
7. Rebuilds in watch mode now take the `@source` configuration into account.
   - Resolves #15684

## Combining with other features

Note that the `not` flag is also already compatible with [`@source inline(…)`](https://github.com/tailwindlabs/tailwindcss/pull/17147) added in an earlier commit:

```css
@import "tailwindcss";
@source not inline("container");
```

## Test plan

- We added a bunch of oxide unit tests to ensure that the right files are scanned.
- We updated the existing integration tests with new `@source not "…"` specific examples and updated the existing tests to match the subtle behavior changes.
- We also added a new special tag `[ci-all]` that, when added to the description of a PR, causes the PR to run unit and integration tests on all operating systems.

[ci-all]

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>
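The layering described above (auto-source rules first, then custom-source allow-list rules, then negative rules, with order mattering as in a `.gitignore` file) can be sketched with a small "last matching rule wins" evaluator. This is a simplified illustration, not the forked `ignore` crate: the `Rule`, `Verdict`, `is_scanned`, and `example_rules` names are hypothetical, and plain prefix/suffix checks stand in for real glob matching.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Ignore,
    Allow,
}

/// Hypothetical rule: a verdict plus a predicate standing in for a glob.
struct Rule {
    verdict: Verdict,
    matches: fn(&str) -> bool,
}

/// Evaluate rules in order; the last rule that matches decides the outcome.
/// Paths that match no rule are scanned.
fn is_scanned(rules: &[Rule], path: &str) -> bool {
    let mut verdict = Verdict::Allow;
    for rule in rules {
        if (rule.matches)(path) {
            verdict = rule.verdict;
        }
    }
    verdict == Verdict::Allow
}

/// Rough equivalent of the simplified gitignore file from the example.
fn example_rules() -> Vec<Rule> {
    vec![
        // Auto-source rule: `*.bin` (binary files are skipped by default).
        Rule {
            verdict: Verdict::Ignore,
            matches: |p| p.ends_with(".bin"),
        },
        // Custom source: `!lib/ui/**/*.bin` overrides the auto-source rule.
        Rule {
            verdict: Verdict::Allow,
            matches: |p| p.starts_with("lib/ui/") && p.ends_with(".bin"),
        },
        // Negative rule: `lib/ui/expensive.bin` is excluded again.
        Rule {
            verdict: Verdict::Ignore,
            matches: |p| p == "lib/ui/expensive.bin",
        },
    ]
}
```

Under these assumptions, `lib/ui/button.bin` is scanned because the custom source overrides the auto-source rule, while `lib/ui/expensive.bin` is skipped because the later negative rule takes precedence.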
2025-03-25 15:54:41 +01:00
version = "0.4.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "54a1028dfc5f5df5da8a56a73e6c153c9a9708ec57232470703592a3f18e49f5"
dependencies = [
"aho-corasick",
"bstr",
"log",
"regex-automata 0.4.8",
"regex-syntax 0.8.5",
]
[[package]]
name = "globwalk"
version = "0.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0bf760ebf69878d9fd8f110c89703d90ce35095324d1f1edcb595c63945ee757"
dependencies = [
"bitflags",
Add `@source not` support (#17255) This PR adds a new source detection feature: `@source not "…"`. It can be used to exclude files specifically from your source configuration without having to think about creating a rule that matches all but the requested file: ```css @import "tailwindcss"; @source not "../src/my-tailwind-js-plugin.js"; ``` While working on this feature, we noticed that there are multiple places with different heuristics we used to scan the file system. These are: - Auto source detection (so the default configuration or an `@source "./my-dir"`) - Custom sources ( e.g. `@source "./**/*.bin"` — these contain file extensions) - The code to detect updates on the file system Because of the different heuristics, we were able to construct failing cases (e.g. when you create a new file into `my-dir` that would be thrown out by auto-source detection, it'd would actually be scanned). We were also leaving a lot of performance on the table as the file system is traversed multiple times for certain problems. To resolve these issues, we're now unifying all of these systems into one `ignore` crate walker setup. We also implemented features like auto-source-detection and the `not` flag as additional _gitignore_ rules only, avoid the need for a lot of custom code needed to make decisions. High level, this is what happens after the now: - We collect all non-negative `@source` rules into a list of _roots_ (that is the source directory for this rule) and optional _globs_ (that is the actual rules for files in this file). For custom sources (i.e with a custom `glob`), we add an allowlist rule to the gitignore setup, so that we can be sure these files are always included. - For every negative `@source` rule, we create respective ignore rules. - Furthermore we have a custom filter that ensures files are only read if they have been changed since the last time they were read. 
So, consider the following setup: ```css /* packages/web/src/index.css */ @import "tailwindcss"; @source "../../lib/ui/**/*.bin"; @source not "../../lib/ui/expensive.bin"; ``` This creates a git ignore file that (simplified) looks like this: ```gitignore # Auto-source rules *.{exe,node,bin,…} *.{css,scss,sass,…} {node_modules,git}/ # Custom sources can overwrite auto-source rules !lib/ui/**/*.bin # Negative rules lib/ui/expensive.bin ``` We then use this information _on top of your existing `.gitignore` setup_ to resolve files (i.e so if your `.gitignore` contains rules e.g. `dist/` this line is going to be added _before_ any of the rules lined out in the example above. This allows negative rules to allow-list your `.gitignore` rules. To implement this, we're rely on the `ignore` crate but we had to make various changes, very specific, to it so we decided to fork the crate. All changes are prefixed with a `// CHANGED:` block but here are the most-important ones: - We added a way to add custom ignore rules that _extend_ (rather than overwrite) your existing `.gitignore` rules - We updated the order in which files are resolved and made it so that more-specific files can allow-list more generic ignore rules. - We resolved various issues related to adding more than one base path to the traversal and ensured it works consistent for Linux, macOS, and Windows. ## Behavioral changes 1. Any custom glob defined via `@source` now wins over your `.gitignore` file and the auto-content rules. - Resolves #16920 3. The `node_modules` and `.git` folders as well as the `.gitignore` file are now ignored by default (but can be overridden by an explicit `@source` rule). - Resolves #17318 - Resolves #15882 4. Source paths into ignored-by-default folders (like `node_modules`) now also win over your `.gitignore` configuration and auto-content rules. - Resolves #16669 5. Introduced `@source not "…"` to negate any previous rules. - Resolves #17058 6. 
Negative `content` rules in your legacy JavaScript configuration (e.g. `content: ['!./src']`) now work with v4. - Resolves #15943 7. The order of `@source` definitions matter now, because you can technically include or negate previous rules. This is similar to your `.gitingore` file. 9. Rebuilds in watch mode now take the `@source` configuration into account - Resolves #15684 ## Combining with other features Note that the `not` flag is also already compatible with [`@source inline(…)`](https://github.com/tailwindlabs/tailwindcss/pull/17147) added in an earlier commit: ```css @import "tailwindcss"; @source not inline("container"); ``` ## Test plan - We added a bunch of oxide unit tests to ensure that the right files are scanned - We updated the existing integration tests with new `@source not "…"` specific examples and updated the existing tests to match the subtle behavior changes - We also added a new special tag `[ci-all]` that, when added to the description of a PR, causes the PR to run unit and integration tests on all operating systems. [ci-all] --------- Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-25 15:54:41 +01:00
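The rule ordering described in the commit message above can be illustrated with a small sketch. It assumes gitignore-style "last matching rule wins" semantics and uses a hand-rolled, hypothetical `Pattern` type instead of the forked `ignore` crate, so the names `Pattern`, `Rule`, `example_rules`, and `is_ignored` are illustrative only and do not reflect the actual Oxide code.

```rust
/// A deliberately tiny pattern language (assumption); real gitignore globs are richer.
enum Pattern {
    /// Any path ending with this suffix, e.g. `*.bin`.
    Suffix(&'static str),
    /// Paths under a directory that end with a suffix, e.g. `lib/ui/**/*.bin`.
    DirSuffix(&'static str, &'static str),
    /// One exact relative path.
    Exact(&'static str),
}

/// One line of the generated ignore file; `allow` models a leading `!`.
struct Rule {
    pattern: Pattern,
    allow: bool,
}

impl Pattern {
    fn matches(&self, path: &str) -> bool {
        match self {
            Pattern::Suffix(suffix) => path.ends_with(*suffix),
            Pattern::DirSuffix(dir, suffix) => path.starts_with(*dir) && path.ends_with(*suffix),
            Pattern::Exact(exact) => path == *exact,
        }
    }
}

/// The three kinds of rules from the simplified example, in file order.
fn example_rules() -> Vec<Rule> {
    vec![
        // Auto-source rule: binary-like extensions are ignored.
        Rule { pattern: Pattern::Suffix(".bin"), allow: false },
        // Custom source: allow-list `lib/ui/**/*.bin`.
        Rule { pattern: Pattern::DirSuffix("lib/ui/", ".bin"), allow: true },
        // Negative rule from `@source not`.
        Rule { pattern: Pattern::Exact("lib/ui/expensive.bin"), allow: false },
    ]
}

/// Walk the rules top to bottom; the last rule that matches decides.
fn is_ignored(rules: &[Rule], path: &str) -> bool {
    let mut ignored = false;
    for rule in rules {
        if rule.pattern.matches(path) {
            ignored = !rule.allow;
        }
    }
    ignored
}
```

With these rules, `is_ignored(&example_rules(), "lib/ui/button.bin")` is `false` because the allow-list rule comes after the auto-source rule, while `lib/ui/expensive.bin` is ignored again by the later negative rule. This is also why the order of `@source` definitions matters.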
"ignore 0.4.23 (registry+https://github.com/rust-lang/crates.io-index)",
"walkdir",
]
[[package]]
name = "ignore"
version = "0.4.23"
dependencies = [
"bstr",
"crossbeam-channel",
"crossbeam-deque",
"dunce",
"globset",
"log",
"memchr",
"regex-automata 0.4.8",
"same-file",
"walkdir",
"winapi-util",
]
[[package]]
name = "ignore"
version = "0.4.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6d89fd380afde86567dfba715db065673989d6253f42b88179abd3eae47bda4b"
dependencies = [
"crossbeam-deque",
"globset",
"log",
"memchr",
"regex-automata 0.4.8",
"same-file",
"walkdir",
"winapi-util",
]
Auto source detection improvements (#14820)

This PR introduces a new `source(…)` argument and improves on the existing `@source`. The goal of this PR is to make the automatic source detection configurable. Let's dig in.

By default, we will perform automatic source detection starting at the current working directory. Auto source detection will find plain text files (no binaries, images, ...) and will ignore git-ignored files.

If you want to start from a different directory, you can add the new `source(…)` argument to the import, as in `@import "tailwindcss/utilities" layer(utilities) source(…)`.

E.g.:

```css
/* ./src/styles/index.css */
@import 'tailwindcss/utilities' layer(utilities) source('../../');
```

Most people won't split their source files and will just use the simple `@import "tailwindcss";`. For this reason, you can use `source(…)` on that import as well.

E.g.:

```css
/* ./src/styles/index.css */
@import 'tailwindcss' source('../../');
```

Sometimes, you want to rely on auto source detection, but also want to look in another directory for source files. In this case, you can use the `@source` directive:

```css
/* ./src/index.css */
@import 'tailwindcss';

/* Look for `blade.php` files in `../resources/views` */
@source '../resources/views/**/*.blade.php';
```

However, you don't need to specify the extension. Instead, you can just point to the directory and all the same automatic source detection rules will apply.

```css
/* ./src/index.css */
@import 'tailwindcss';

@source '../resources/views';
```

If, for whatever reason, you want to disable the default source detection feature entirely, and only want to rely on very specific glob patterns you define, then you can disable it via `source(none)`.

```css
/* Completely disable the default auto source detection */
@import 'tailwindcss' source(none);

/* Only look at .blade.php files, nothing else */
@source "../resources/views/**/*.blade.php";
```

Note: even with `source(none)`, if your `@source` points to a directory, then auto source detection will still be performed in that directory. If you don't want that, then you can simply add explicit files in the globs as seen in the previous example.

```css
/* Completely disable the default auto source detection */
@import 'tailwindcss' source(none);

/* Run auto source detection in `../resources/views` */
@source "../resources/views";
```

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Adam Wathan <4323180+adamwathan@users.noreply.github.com>
2024-10-29 21:33:34 +01:00
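To make the difference between a directory source and a glob source in the commit message above more concrete, here is a rough sketch. It rests on two assumptions that the message only implies: paths are resolved relative to the stylesheet (as the file-path comments in the examples suggest), and a source counts as a glob as soon as one segment contains a glob character. `Source`, `normalize`, and `classify` are hypothetical names and not the Tailwind CSS API. `source(none)` is not modeled.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical classification of an `@source` / `source(…)` argument.
#[derive(Debug, PartialEq)]
enum Source {
    /// A plain directory: auto source detection runs inside it.
    Auto(PathBuf),
    /// An explicit glob: only files matching `pattern` below `base` are scanned.
    Glob { base: PathBuf, pattern: String },
}

/// Lexically resolve `.` and `..` segments without touching the file system.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                // Drop the previous normal segment if there is one, otherwise keep `..`.
                let can_pop = matches!(out.components().next_back(), Some(Component::Normal(_)));
                if can_pop {
                    out.pop();
                } else {
                    out.push("..");
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    if out.as_os_str().is_empty() {
        out.push(".");
    }
    out
}

/// Split a relative source into a base directory and an optional glob pattern,
/// resolving the base relative to the directory of the stylesheet.
fn classify(css_file: &Path, source: &str) -> Source {
    let css_dir = css_file.parent().unwrap_or(Path::new("."));
    let is_glob = |segment: &str| segment.contains(|c: char| matches!(c, '*' | '?' | '{' | '['));

    let mut base = Vec::new();
    let mut pattern = Vec::new();
    for segment in source.split('/').filter(|s| !s.is_empty()) {
        // Everything from the first glob segment onwards belongs to the pattern.
        if pattern.is_empty() && !is_glob(segment) {
            base.push(segment);
        } else {
            pattern.push(segment);
        }
    }

    let base = normalize(&css_dir.join(base.join("/")));
    if pattern.is_empty() {
        Source::Auto(base)
    } else {
        Source::Glob { base, pattern: pattern.join("/") }
    }
}
```

Under these assumptions, `classify(Path::new("./src/index.css"), "../resources/views")` yields an `Auto` source rooted at `resources/views`, whereas appending `/**/*.blade.php` yields a `Glob` with the same base and the pattern `**/*.blade.php`.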
[[package]]
name = "itertools"
version = "0.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b1c173a5686ce8bfa551b3563d0c2170bf24ca44da99c7ca4bfdab5418c3fe57"
dependencies = [
"either",
]
[[package]]
name = "lazy_static"
version = "1.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
[[package]]
name = "libc"
version = "0.2.159"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "561d97a539a36e26a9a5fad1ea11a3039a67714694aaa379433e580854bc3dc5"
[[package]]
name = "libloading"
version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4979f22fdb869068da03c9f7528f8297c6fd2606bc3a4affe42e6a823fdb8da4"
dependencies = [
"cfg-if",
"windows-targets",
]
[[package]]
name = "linux-raw-sys"
version = "0.4.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "78b3ae25bc7c8c38cec158d1f2757ee79e9b3740fbc7ccf0e59e4b08d793fa89"
[[package]]
name = "log"
version = "0.4.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a7a70ba024b9dc04c27ea2f0c0548feb474ec5c54bba33a7f72f873a39d07b24"
[[package]]
name = "matchers"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8263075bb86c5a1b1427b5ae862e8889656f126e9f77c484496e8b47cf5c5558"
dependencies = [
"regex-automata 0.1.10",
]
[[package]]
name = "memchr"
version = "2.7.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "78ca9ab1a0babb1e7d5695e3530886289c18cf2f87ec19a575a0abdce112e3a3"
[[package]]
name = "minimal-lexical"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "68354c5c6bd36d73ff3feceb05efa59b6acb7626617f4962be322a825e61f79a"
[[package]]
name = "napi"
version = "2.16.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "55740c4ae1d8696773c78fdafd5d0e5fe9bc9f1b071c7ba493ba5c413a9184f3"
dependencies = [
"bitflags",
"ctor",
"napi-derive",
"napi-sys",
"once_cell",
]
[[package]]
name = "napi-build"
version = "2.1.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e28acfa557c083f6e254a786e01ba253fc56f18ee000afcd4f79af735f73a6da"
[[package]]
name = "napi-derive"
Improve Oxide candidate extractor [0] (#16306)

This PR adds a new candidate[^candidate] extractor with 2 major goals in mind:

1. It must be way easier to reason about and maintain.
2. It must have on-par performance or better than the current candidate extractor.

### Problem

Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good.

One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that.

The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is. In a perfect world we would limit the characters that can be used and define a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions).

The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in).

While there is definitely some structure, we essentially work in 2 phases:

1. Try to extract `0..n` candidates. (This is the hard part)
2. Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have)

Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability.

Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not.

(we have some ideas to limit the amount of false positives but that's for another time)

### Solution

Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this:

```html
<div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div>
```

<details>
<summary>This will be parsed into the following AST:</summary>

```json
[
  {
    "kind": "functional",
    "root": "text",
    "value": {
      "kind": "named",
      "value": "red-500",
      "fraction": null
    },
    "modifier": {
      "kind": "arbitrary",
      "value": "var(--my-opacity)"
    },
    "variants": [
      {
        "kind": "static",
        "root": "hover"
      },
      {
        "kind": "functional",
        "root": "data",
        "value": {
          "kind": "arbitrary",
          "value": "state=pending"
        },
        "modifier": null
      },
      {
        "kind": "arbitrary",
        "selector": "@media(pointer:fine)",
        "relative": false
      }
    ],
    "important": false,
    "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"
  }
]
```

</details>

We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ...
Some of these patterns will be important for the new candidate extractor as well:

| Name                       | Example           | Description                                                                                         |
| -------------------------- | ----------------- | --------------------------------------------------------------------------------------------------- |
| Static utility (named)     | `flex`            | A simple utility with no inputs whatsoever                                                          |
| Functional utility (named) | `bg-red-500`      | A utility `bg` with an input that is named `red-500`                                                |
| Arbitrary value            | `bg-[#0088cc]`    | A utility `bg` with an input that is arbitrary, denoted by `[…]`                                    |
| Arbitrary variable         | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` |
| Arbitrary property         | `[color:red]`     | A utility that sets a property to a value on the fly                                                |

A similar structure exists for modifiers, where each modifier must start with `/`:

| Name               | Example                     | Description                              |
| ------------------ | --------------------------- | ---------------------------------------- |
| Named modifier     | bg-red-500`/20`             | A named modifier                         |
| Arbitrary value    | bg-red-500`/[20%]`          | An arbitrary value, denoted by `/[…]`    |
| Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` |

Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`.

| Name               | Example                     | Description                                                              |
| ------------------ | --------------------------- | ------------------------------------------------------------------------ |
| Named variant      | `hover:`                    | A named variant                                                          |
| Arbitrary value    | `data-[state=pending]:`     | An arbitrary value, denoted by `[…]`                                     |
| Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)`                                  |
| Arbitrary variant  | `[@media(pointer:fine)]:`   | Similar to arbitrary properties, this will generate a variant on the fly |

The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress).

This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines.

One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early.

At the time of writing this, there are a bunch of machines:

<details>
<summary>Overview of the machines</summary>

- `ArbitraryPropertyMachine`

  Extracts candidates such as `[color:red]`. Some of the rules are:

  1. There must be a property name
  2. There must be a `:`
  3. There must be a value

  There cannot be any spaces, the brackets are included, and if the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`).

  ```
  [color:red]
  ^^^^^^^^^^^

  [--my-color:red]
  ^^^^^^^^^^^^^^^^
  ```

  Depends on the `StringMachine` and `CssVariableMachine`.

- `ArbitraryValueMachine`

  Extracts arbitrary values for utilities and modifiers including the brackets:

  ```
  bg-[#0088cc]
     ^^^^^^^^^

  bg-red-500/[20%]
             ^^^^^
  ```

  Depends on the `StringMachine`.

- `ArbitraryVariableMachine`

  Extracts arbitrary variables including the parentheses. The first argument must be a valid CSS variable, the other arguments are optional fallback arguments.

  ```
  (--my-value)
  ^^^^^^^^^^^^

  bg-red-500/(--my-opacity)
             ^^^^^^^^^^^^^^
  ```

  Depends on the `StringMachine` and `CssVariableMachine`.

- `CandidateMachine`

  Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility.

  ```
  hover:focus:flex
  ^^^^^^^^^^^^^^^^

  aria-invalid:bg-red-500/(--my-opacity)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ```

  Depends on the `VariantMachine` and `UtilityMachine`.

- `CssVariableMachine`

  Extracts CSS variables. They must start with `--`, must contain at least one alphanumeric character, `-`, or `_`, and can contain any escaped character (except for whitespace).

  ```
  bg-(--my-color)
      ^^^^^^^^^^

  bg-red-500/(--my-opacity)
              ^^^^^^^^^^^^

  bg-(--my-color)/(--my-opacity)
      ^^^^^^^^^^   ^^^^^^^^^^^^
  ```

- `ModifierMachine`

  Extracts modifiers including the `/`

  - `/[` will delegate to the `ArbitraryValueMachine`
  - `/(` will delegate to the `ArbitraryVariableMachine`

  ```
  bg-red-500/20
            ^^^

  bg-red-500/[20%]
            ^^^^^^

  bg-red-500/(--my-opacity)
            ^^^^^^^^^^^^^^^
  ```

  Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`.

- `NamedUtilityMachine`

  Extracts named utilities regardless of whether they are functional or static.

  ```
  flex
  ^^^^

  px-2.5
  ^^^^^^
  ```

  This includes rules like: A `.` must be surrounded by digits.

  Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`.

- `NamedVariantMachine`

  Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about.

  Another rule is that the `:` must be included.

  ```
  hover:flex
  ^^^^^^

  data-[state=pending]:flex
  ^^^^^^^^^^^^^^^^^^^^^

  supports-(--my-variable):flex
  ^^^^^^^^^^^^^^^^^^^^^^^^^
  ```

  Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`.

- `StringMachine`

  This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks.

  We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines.

  ```
  content-["Hello_World!"]
           ^^^^^^^^^^^^^^

  bg-[url("https://example.com")]
          ^^^^^^^^^^^^^^^^^^^^^
  ```

- `UtilityMachine`

  Extracts utilities. It will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility.

  It will also handle important markers (including the legacy important marker).

  ```
  flex
  ^^^^

  bg-red-500/20
  ^^^^^^^^^^^^^

  !bg-red-500/20      Legacy important marker
  ^^^^^^^^^^^^^^

  bg-red-500/20!      New important marker
  ^^^^^^^^^^^^^^

  !bg-red-500/20!     Both, but this is considered invalid
  ^^^^^^^^^^^^^^^
  ```

  Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`.

- `VariantMachine`

  Extracts variants. It will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant.

  ```
  hover:focus:flex
  ^^^^^^
        ^^^^^^
  ```

  Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`.

</details>

One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method and returns a `MachineState`.

The `MachineState` looks like this:

```rs
enum MachineState {
    Idle,
    Done(Span)
}
```

Where a `Span` is just the location in the input where the candidate was found.
```rs
struct Span {
    pub start: usize,
    pub end: usize,
}
```

#### Complexities

**Boundary characters:**

When running these machines to completion, they don't typically check for boundary characters; the wrapping `CandidateMachine` will check for boundary characters.

A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate.

```html
<div class="flex"></div>
<!--       ^    ^ -->
```

The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate.

**What to pick?**

Let's imagine you are parsing this input:

```html
<div class="hover:flex"></div>
```

The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this:

```rs
let variant_machine_state = variant_machine.next(cursor);
//  MachineState::Done(Span { start: 12, end: 17 }) // `hover:`

let utility_machine_state = utility_machine.next(cursor);
//  MachineState::Done(Span { start: 12, end: 16 }) // `hover`
```

They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility.

Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.:

```tsx
<div
  class={clsx({
    underline: someCondition(),
  })}
></div>
```

In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid is that there wasn't a variant present prior to this candidate.

E.g.:

```tsx
<div
  class={clsx({
    hover:underline: someCondition(),
  })}
></div>
```

This will be considered invalid. If you do want this, you should use quotes.

E.g.:

```tsx
<div
  class={clsx({
    'hover:underline': someCondition(),
  })}
></div>
```

**Overlapping/covered spans:**

Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example:

```csharp
public enum StackSpacing
{
    [CssClass("gap-y-4")]
    Small,

    [CssClass("gap-y-6")]
    Medium,

    [CssClass("gap-y-8")]
    Large
}
```

In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here:

1. It is an arbitrary property, e.g.: `[color:red]`
2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:`

When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state.

- This is because it is not a valid variant because it didn't end with a `:`.
- It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value.

Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines' perspective.

Obviously we want to extract the `gap-y-*` classes here.

To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing.

That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, we will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class.

The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`)? Then the `VariantMachine` would complete, but the `UtilityMachine` would not, because no utility exists after it. We will apply the same idea in this case.
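The re-scan over the failed slice can be sketched roughly as follows. This is only an illustration of the idea, not the Oxide implementation: `scan_plain` is a hypothetical stand-in for running a fresh `CandidateMachine`, it only accepts lowercase `a-z0-9-` tokens, and its set of boundary bytes is an assumption. `Span::end` is inclusive, matching the spans shown earlier.

```rust
/// Location of a candidate in the input; `end` is inclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Bytes that may touch a candidate without being part of it (assumed set).
fn is_boundary(byte: u8) -> bool {
    matches!(
        byte,
        b' ' | b'\t' | b'\n' | b'\r' | b'"' | b'\'' | b'`' | b'(' | b')' | b'[' | b']'
    )
}

/// Hypothetical stand-in for a fresh `CandidateMachine`: accepts tokens that
/// start with a lowercase letter and contain only `a-z`, `0-9`, and `-`.
fn scan_plain(slice: &[u8], offset: usize) -> Vec<Span> {
    let mut spans = Vec::new();
    let mut i = 0;
    while i < slice.len() {
        if is_boundary(slice[i]) {
            i += 1;
            continue;
        }
        // Read one token up to the next boundary byte.
        let start = i;
        while i < slice.len() && !is_boundary(slice[i]) {
            i += 1;
        }
        let token = &slice[start..i];
        let valid = token[0].is_ascii_lowercase()
            && token
                .iter()
                .all(|&b| matches!(b, b'a'..=b'z' | b'0'..=b'9' | b'-'));
        if valid {
            spans.push(Span { start: offset + start, end: offset + i - 1 });
        }
    }
    spans
}

/// After the machines went idle on `input[start..=end]`, restart the scan
/// right after every `[` inside that slice. Requires `end < input.len()`.
fn rescan_after_brackets(input: &[u8], start: usize, end: usize) -> Vec<Span> {
    let mut spans = Vec::new();
    for i in start..=end {
        if input[i] == b'[' {
            for span in scan_plain(&input[i + 1..end + 1], i + 1) {
                // Nested brackets can yield the same span more than once.
                if !spans.contains(&span) {
                    spans.push(span);
                }
            }
        }
    }
    spans
}
```

For the input `[CssClass("gap-y-6")]`, calling `rescan_after_brackets(input, 0, input.len() - 1)` returns a single span covering bytes 11 through 17, which is `gap-y-6`.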
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility).

In this case we will make sure to drop spans that are covered by other spans.

The extracted `Span`s will be valid candidates, therefore if the outermost candidate is valid, we can throw away the inner candidate.

```
Position:   11112222222
            67890123456
            ↓↓↓↓↓↓↓↓↓↓↓
Span { start: 17, end: 25 } // color:red
Span { start: 16, end: 26 } // [color:red]
```

#### Exceptions

**JavaScript keys as candidates:**

We already talked about the `clsx` scenario, but there are a few more exceptions and that has to do with different syntaxes.

**CSS class shorthand in certain templating languages:**

In Pug and Slim, you can have a syntax like this:

```pug
.flex.underline
  div Hello World
```

<details>
<summary>Generated HTML</summary>

```html
<div class="flex underline">
  <div>Hello World</div>
</div>
```

</details>

We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input shorter or longer, otherwise the positions might be off.

In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that.

If you want to use `px-2.5` with this syntax, then you'd write:

```pug
.flex.px-2.5
  div Hello World
```

But that's invalid because that technically means `flex`, `px-2`, and `5` as classes.

You can use this syntax to get around that:

```pug
div(class="px-2.5")
  div Hello World
```

<details>
<summary>Generated HTML</summary>

```html
<div class="px-2.5">
  <div>Hello World</div>
</div>
```

</details>

Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that) and ignore replacing `.` inside of strings.

**Ruby's weird string syntax:**

```ruby
%w[flex underline]
```

This is valid syntax and is shorthand for:

```ruby
["flex", "underline"]
```

Luckily this problem is solved by running the sub-machines after each `[` character.

### Performance

**Testing:**

Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command:

```sh
cargo test test_variant_machine_performance --release -- --ignored
```

This will run the test in release mode and allows you to run the ignored test.

> [!CAUTION]
> This test **_will_** fail, but it will print some output. E.g.:

```
tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s
tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns
```

**Readability:**

One thing to note when looking at the code is that it's not always written in the cleanest way but we had to make some sacrifices for performance reasons.

The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`.

A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit. This can be written as:

```rs
match (cursor.prev, cursor.curr, cursor.next) {
    (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ }
    _ => { /* … */ }
}
```

But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`.

To solve this we use some nesting: only once we reach `b'.'` will we check the previous and next characters. We will also early return in most places. If the previous character is not a digit, there is no need to check the next character.

**Classification and jump tables:**

Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes.

E.g.:

```rs
#[derive(Debug, Clone, Copy, PartialEq)]
enum Class {
    /// ', ", or `
    Quote,

    /// \
    Escape,

    /// Whitespace characters
    Whitespace,

    Other,
}

const CLASS_TABLE: [Class; 256] = {
    let mut table = [Class::Other; 256];

    macro_rules! set {
        ($class:expr, $($byte:expr),+ $(,)?) => {
            $(table[$byte as usize] = $class;)+
        };
    }

    set!(Class::Quote, b'"', b'\'', b'`');
    set!(Class::Escape, b'\\');
    set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C');

    table
};
```

There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values.

**Inlining**:

Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding a `#[inline(always)]` improves performance, sometimes it doesn't improve it at all.

You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added.

### Test Plan

1. 
Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates. So keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outer most `extractor/mod.rs`. 2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates. 3. You can test each machine's performance if you want to. There is a chance that this new parser is missing candidates even though a lot of tests are added and existing tests have been ported. To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious. #### Tailwind UI On Tailwind UI the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index d83b0a506..b3dd94a1d 100644 --- a/./main.css +++ b/./pr.css @@ -5576,9 +5576,6 @@ @layer utilities { --tw-saturate: saturate(0%); filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } - .\!filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important; - } .filter { filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } ``` </details> The reason `!filter` is gone, is because it was used like this: ```js getProducts.js 23: if (!filter) return true ``` And right now `(` and `)` are not considered valid boundary characters for a candidate. 
#### Catalyst On Catalyst, the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index 9f8ed129..4aec992e 100644 --- a/./main.css +++ b/./pr.css @@ -2105,9 +2105,6 @@ .outline-transparent { outline-color: transparent; } - .filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); - } .backdrop-blur-\[6px\] { --tw-backdrop-blur: blur(6px); -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,); @@ -7141,46 +7138,6 @@ inherits: false; initial-value: solid; } -@property --tw-blur { - syntax: "*"; - inherits: false; -} -@property --tw-brightness { - syntax: "*"; - inherits: false; -} -@property --tw-contrast { - syntax: "*"; - inherits: false; -} -@property --tw-grayscale { - syntax: "*"; - inherits: false; -} -@property --tw-hue-rotate { - syntax: "*"; - inherits: false; -} -@property --tw-invert { - syntax: "*"; - inherits: false; -} -@property --tw-opacity { - syntax: "*"; - inherits: false; -} -@property --tw-saturate { - syntax: "*"; - inherits: false; -} -@property --tw-sepia { - syntax: "*"; - inherits: false; -} -@property --tw-drop-shadow { - syntax: "*"; - inherits: false; -} @property --tw-backdrop-blur { syntax: "*"; inherits: false; ``` </details> The reason for this is that `filter` was only used as a function call: ```tsx src/app/docs/Code.tsx 31: .filter((x) => x !== null) ``` This was tested on all templates and they all remove a very small amount of classes that aren't used. 
The script to test this looks like this: ```sh bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css git diff --no-index --patch ./{main,pr}.css ``` This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder. --- ### Fixes: - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property) --- ### Ideas for in the future 1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object. 2. If you take a look at the AST, you'll notice that utilities and variants have a "root", these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class. 3. Passthrough the `prefix` information. Everything that doesn't start with `tw:` can be skipped. ### Design decisions that didn't make it Once you reach this part, you can stop reading if you want to, but this is more like a brain dump of the things we tried and didn't work out. Wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are. 
#### One character at a time In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to: 1. Ask the `CandidateMachine` which state it was in. 2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character. 3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on. 4. Once we did all of that we could go to the next character. In this approach, the `MachineState` looked like this instead: ```rs enum MachineState { Idle, Parsing, Done(Span) } ``` This had its own set of problems because now it's very hard to know whether we are done or not. ```html <div class="hover:flex"></div> <!-- ^ --> ``` Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not. #### `Span` stitching Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`. The next thing we had to do was to make sure that: 1. Covered spans were removed. We still do this part in the current implementation. 2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`). 3. For every combined variant span, find a corresponding utility span. - If there is no utility span, the candidate is invalid. - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans) - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block` 4. All left-over utility spans are candidates without variants. 
This approach was slow, and still a bit hard to reason about. #### Matching on tuples While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about. It was not very fast. Unfortunately had to abandon this approach in favor of a more optimized approach. In a perfect world, we would still write it this way, but have some compile time macro that would optimize this for us. #### Matching against `b'…'` instead of classification and jump tables Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster. Luckily for us, each machine has it's own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine. [^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme. --------- Co-authored-by: Jordan Pittman <jordan@cryptica.me> Co-authored-by: Philipp Spiess <hello@philippspiess.com>
version = "2.16.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
The script to test this looks like this: ```sh bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css git diff --no-index --patch ./{main,pr}.css ``` This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder. --- ### Fixes: - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property) --- ### Ideas for in the future 1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object. 2. If you take a look at the AST, you'll notice that utilities and variants have a "root", these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class. 3. Passthrough the `prefix` information. Everything that doesn't start with `tw:` can be skipped. ### Design decisions that didn't make it Once you reach this part, you can stop reading if you want to, but this is more like a brain dump of the things we tried and didn't work out. Wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are. 
#### One character at a time In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to: 1. Ask the `CandidateMachine` which state it was in. 2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character. 3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on. 4. Once we did all of that we could go to the next character. In this approach, the `MachineState` looked like this instead: ```rs enum MachineState { Idle, Parsing, Done(Span) } ``` This had its own set of problems because now it's very hard to know whether we are done or not. ```html <div class="hover:flex"></div> <!-- ^ --> ``` Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not. #### `Span` stitching Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`. The next thing we had to do was to make sure that: 1. Covered spans were removed. We still do this part in the current implementation. 2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`). 3. For every combined variant span, find a corresponding utility span. - If there is no utility span, the candidate is invalid. - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans) - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block` 4. All left-over utility spans are candidates without variants. 
This approach was slow, and still a bit hard to reason about. #### Matching on tuples While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about. It was not very fast. Unfortunately had to abandon this approach in favor of a more optimized approach. In a perfect world, we would still write it this way, but have some compile time macro that would optimize this for us. #### Matching against `b'…'` instead of classification and jump tables Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster. Luckily for us, each machine has it's own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine. [^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme. --------- Co-authored-by: Jordan Pittman <jordan@cryptica.me> Co-authored-by: Philipp Spiess <hello@philippspiess.com>
checksum = "7cbe2585d8ac223f7d34f13701434b9d5f4eb9c332cccce8dee57ea18ab8ab0c"
dependencies = [
"cfg-if",
"convert_case",
"napi-derive-backend",
"proc-macro2",
"quote",
"syn",
]

[[package]]
name = "napi-derive-backend"
Improve Oxide candidate extractor [0] (#16306) This PR adds a new candidate[^candidate] extractor with 2 major goals in mind: 1. It must be way easier to reason about and maintain. 2. It must have on-par performance or better than the current candidate extractor. ### Problem Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good. One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that. The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is. In a perfect world we limit the characters that can be used and defined a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions). The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in). While there is definitely some structure, we essentially work in 2 phases: 1. Try to extract `0..n` candidates. (This is the hard part) 2. 
Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have) Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability. Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not. (we have some ideas to limit the amount of false positives but that's for another time) ### Solution Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this: ```html <div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div> ``` <details> <summary>This will be parsed into the following AST:</summary> ```json [ { "kind": "functional", "root": "text", "value": { "kind": "named", "value": "red-500", "fraction": null }, "modifier": { "kind": "arbitrary", "value": "var(--my-opacity)" }, "variants": [ { "kind": "static", "root": "hover" }, { "kind": "functional", "root": "data", "value": { "kind": "arbitrary", "value": "state=pending" }, "modifier": null }, { "kind": "arbitrary", "selector": "@media(pointer:fine)", "relative": false } ], "important": false, "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)" } ] ``` </details> We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ... 
Some of these patterns will be important for the new candidate extractor as well: | Name | Example | Description | | -------------------------- | ----------------- | --------------------------------------------------------------------------------------------------- | | Static utility (named) | `flex` | A simple utility with no inputs whatsoever | | Functional utility (named) | `bg-red-500` | A utility `bg` with an input that is named `red-500` | | Arbitrary value | `bg-[#0088cc]` | A utility `bg` with an input that is arbitrary, denoted by `[…]` | | Arbitrary variable | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` | | Arbitrary property | `[color:red]` | A utility that sets a property to a value on the fly | A similar structure exist for modifiers, where each modifier must start with `/`: | Name | Example | Description | | ------------------ | --------------------------- | ---------------------------------------- | | Named modifier | bg-red-500`/20` | A named modifier | | Arbitrary value | bg-red-500`/[20%]` | An arbitrary value, denoted by `/[…]` | | Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` | Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`. 
| Name | Example | Description | | ------------------ | --------------------------- | ------------------------------------------------------------------------ | | Named variant | `hover:` | A named variant | | Arbitrary value | `data-[state=pending]:` | An arbitrary value, denoted by `[…]` | | Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)` | | Arbitrary variant | `[@media(pointer:fine)]:` | Similar to arbitrary properties, this will generate a variant on the fly | The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress). This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines. One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early. At the time of writing this, there are a bunch of machines: <details> <summary>Overview of the machines</summary> - `ArbitraryPropertyMachine` Extracts candidates such as `[color:red]`. Some of the rules are: 1. There must be a property name 2. There must be a `:` 3. There must ba a value There cannot be any spaces, the brackets are included, if the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`). ``` [color:red] ^^^^^^^^^^^ [--my-color:red] ^^^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `ArbitraryValueMachine` Extracts arbitrary values for utilities and modifiers including the brackets: ``` bg-[#0088cc] ^^^^^^^^^ bg-red-500/[20%] ^^^^^ ``` Depends on the `StringMachine`. - `ArbitraryVariableMachine` Extracts arbitrary variables including the parentheses. 
The first argument must be a valid CSS variable, the other arguments are optional fallback arguments. ``` (--my-value) ^^^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `CandidateMachine` Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility. ``` hover:focus:flex ^^^^^^^^^^^^^^^^ aria-invalid:bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `VariantMachine` and `UtilityMachine`. - `CssVariableMachine` Extracts CSS variables, they must start with `--` and must contain at least one alphanumeric character or, `-`, `_` and can contain any escaped character (except for whitespace). ``` bg-(--my-color) ^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^ bg-(--my-color)/(--my-opacity) ^^^^^^^^^^ ^^^^^^^^^^^^ ``` - `ModifierMachine` Extracts modifiers including the `/` - `/[` will delegate to the `ArbitraryValueMachine` - `/(` will delegate to the `ArbitraryVariableMachine` ``` bg-red-500/20 ^^^ bg-red-500/[20%] ^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedUtilityMachine` Extracts named utilities regardless of whether they are functional or static. ``` flex ^^^^ px-2.5 ^^^^^^ ``` This includes rules like: A `.` must be surrounded by digits. Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedVariantMachine` Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about. Another rule is that the `:` must be included. 
``` hover:flex ^^^^^^ data-[state=pending]:flex ^^^^^^^^^^^^^^^^^^^^^ supports-(--my-variable):flex ^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`. - `StringMachine` This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks. We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines. ``` content-["Hello_World!"] ^^^^^^^^^^^^^^ bg-[url("https://example.com")] ^^^^^^^^^^^^^^^^^^^^^ ``` - `UtilityMachine` Extracts utilities, it will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility. It will also handle important markers (including the legacy important marker). ``` flex ^^^^ bg-red-500/20 ^^^^^^^^^^^^^ !bg-red-500/20 Legacy important marker ^^^^^^^^^^^^^^ bg-red-500/20! New important marker ^^^^^^^^^^^^^^ !bg-red-500/20! Both, but this is considered invalid ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`. - `VariantMachine` Extracts variants, it will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant. ``` hover:focus:flex ^^^^^^ ^^^^^^ ``` Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`. </details> One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method and returns a `MachineState`. The `MachineState` looks like this: ```rs enum MachineState { Idle, Done(Span) } ``` Where a `Span` is just the location in the input where the candidate was found. 
```rs struct Span { pub start: usize, pub end: usize, } ``` #### Complexities **Boundary characters:** When running these machines to completion, they don't typically check for boundary characters, the wrapping `CandidateMachine` will check for boundary characters. A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate. ```html <div class="flex"></div> <!-- ^ ^ --> ``` The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate. **What to pick?** Let's imagine you are parsing this input: ```html <div class="hover:flex"></div> ``` The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this: ```rs let variant_machine_state = variant_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 17 }) // `hover:` let utility_machine_state = utility_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 16 }) // `hover` ``` They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility. Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.: ```tsx <div class={clsx({ underline: someCondition(), })} ></div> ``` In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid, is because there wasn't a variant present prior to this candidate. 
E.g.: ```tsx <div class={clsx({ hover:underline: someCondition(), })} ></div> ``` This will be considered invalid, if you do want this, you should use quotes. E.g.: ```tsx <div class={clsx({ 'hover:underline': someCondition(), })} ></div> ``` **Overlapping/covered spans:** Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example: ```csharp public enum StackSpacing { [CssClass("gap-y-4")] Small, [CssClass("gap-y-6")] Medium, [CssClass("gap-y-8")] Large } ``` In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here: 1. It is an arbitrary property, e.g.: `[color:red]` 2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:` When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state. - This is because it is not a valid variant because it didn't end with a `:`. - It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value. Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines perspective. Obviously we want to extract the `gap-y-*` classes here. To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing. That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class. The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`), then the `VariantMachine` would complete, but the `UtilityMachine` will not because not exists after it. We will apply the same idea in this case. 
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility). In this case we will make sure to drop spans that are covered by other spans. The extracted `Span`s will be valid candidates therefore if the outer most candidate is valid, we can throw away the inner candidate. ``` Position: 11112222222 67890123456 ↓↓↓↓↓↓↓↓↓↓↓ Span { start: 17, end: 25 } // color:red Span { start: 16, end: 26 } // [color:red] ``` #### Exceptions **JavaScript keys as candidates:** We already talked about the `clsx` scenario, but there are a few more exceptions and that has to do with different syntaxes. **CSS class shorthand in certain templating languages:** In Pug and Slim, you can have a syntax like this: ```pug .flex.underline div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="flex underline"> <div>Hello World</div> </div> ``` </details> We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input smaller or longer otherwise the positions might be off. In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that. If you want to use `px-2.5` with this syntax, then you'd write: ```pug .flex.px-2.5 div Hello World ``` But that's invalid because that technically means `flex`, `px-2`, and `5` as classes. 
You can use this syntax to get around that: ```pug div(class="px-2.5") div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="px-2.5"> <div>Hello World</div> </div> ``` </details> Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that) and ignore replacing `.` inside of strings. **Ruby's weird string syntax:** ```ruby %w[flex underline] ``` This is valid syntax and is shorthand for: ```ruby ["flex", "underline"] ``` Luckily this problem is solved by the running the sub-machines after each `[` character. ### Performance **Testing:** Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command: ```sh cargo test test_variant_machine_performance --release -- --ignored ``` This will run the test in release mode and allows you to run the ignored test. > [!CAUTION] > This test **_will_** fail, but it will print some output. E.g.: ``` tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns ``` **Readability:** One thing to note when looking at the code is that it's not always written in the cleanest way but we had to make some sacrifices for performance reasons. The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`. A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit. 
This can be written as: ```rs match (cursor.prev, cursor.curr, cursor.next) { (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ } _ => { /* … */ } } ``` But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`. To solve this we use some nesting, once we reach `b'.'` only then will we check for the previous and next characters. We will also early return in most places. If the previous character is not a digit, there is no need to check the next character. **Classification and jump tables:** Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes. E.g.: ```rs #[derive(Debug, Clone, Copy, PartialEq)] enum Class { /// ', ", or ` Quote, /// \ Escape, /// Whitespace characters Whitespace, Other, } const CLASS_TABLE: [Class; 256] = { let mut table = [Class::Other; 256]; macro_rules! set { ($class:expr, $($byte:expr),+ $(,)?) => { $(table[$byte as usize] = $class;)+ }; } set!(Class::Quote, b'"', b'\'', b'`'); set!(Class::Escape, b'\\'); set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C'); table }; ``` There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values. **Inlining**: Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding a `#[inline(always)]` improves performance, sometimes it doesn't improve it at all. You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added. ### Test Plan 1. 
Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates. So keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outer most `extractor/mod.rs`. 2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates. 3. You can test each machine's performance if you want to. There is a chance that this new parser is missing candidates even though a lot of tests are added and existing tests have been ported. To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious. #### Tailwind UI On Tailwind UI the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index d83b0a506..b3dd94a1d 100644 --- a/./main.css +++ b/./pr.css @@ -5576,9 +5576,6 @@ @layer utilities { --tw-saturate: saturate(0%); filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } - .\!filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important; - } .filter { filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } ``` </details> The reason `!filter` is gone, is because it was used like this: ```js getProducts.js 23: if (!filter) return true ``` And right now `(` and `)` are not considered valid boundary characters for a candidate. 
#### Catalyst On Catalyst, the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index 9f8ed129..4aec992e 100644 --- a/./main.css +++ b/./pr.css @@ -2105,9 +2105,6 @@ .outline-transparent { outline-color: transparent; } - .filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); - } .backdrop-blur-\[6px\] { --tw-backdrop-blur: blur(6px); -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,); @@ -7141,46 +7138,6 @@ inherits: false; initial-value: solid; } -@property --tw-blur { - syntax: "*"; - inherits: false; -} -@property --tw-brightness { - syntax: "*"; - inherits: false; -} -@property --tw-contrast { - syntax: "*"; - inherits: false; -} -@property --tw-grayscale { - syntax: "*"; - inherits: false; -} -@property --tw-hue-rotate { - syntax: "*"; - inherits: false; -} -@property --tw-invert { - syntax: "*"; - inherits: false; -} -@property --tw-opacity { - syntax: "*"; - inherits: false; -} -@property --tw-saturate { - syntax: "*"; - inherits: false; -} -@property --tw-sepia { - syntax: "*"; - inherits: false; -} -@property --tw-drop-shadow { - syntax: "*"; - inherits: false; -} @property --tw-backdrop-blur { syntax: "*"; inherits: false; ``` </details> The reason for this is that `filter` was only used as a function call: ```tsx src/app/docs/Code.tsx 31: .filter((x) => x !== null) ``` This was tested on all templates and they all remove a very small amount of classes that aren't used. 
The script to test this looks like this:

```sh
bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css
bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css
git diff --no-index --patch ./{main,pr}.css
```

This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder.

---

### Fixes:

- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property)

---

### Ideas for in the future

1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object.
2. If you take a look at the AST, you'll notice that utilities and variants have a "root". These are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class.
3. Pass through the `prefix` information. Everything that doesn't start with `tw:` can be skipped.

### Design decisions that didn't make it

Once you reach this part, you can stop reading if you want to. This is more of a brain dump of the things we tried that didn't work out. We wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are.
#### One character at a time

In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to:

1. Ask the `CandidateMachine` which state it was in.
2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character.
3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on.
4. Once we did all of that we could go to the next character.

In this approach, the `MachineState` looked like this instead:

```rs
enum MachineState {
    Idle,
    Parsing,
    Done(Span)
}
```

This had its own set of problems because now it's very hard to know whether we are done or not.

```html
<div class="hover:flex"></div>
<!--             ^          -->
```

Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not.

#### `Span` stitching

Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`.

The next thing we had to do was to make sure that:

1. Covered spans were removed. We still do this part in the current implementation.
2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`).
3. For every combined variant span, find a corresponding utility span.
   - If there is no utility span, the candidate is invalid.
   - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans)
   - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block`
4. All left-over utility spans are candidates without variants.
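Step 1 of the list above (removing covered spans) can be sketched as follows. This is a minimal sketch under assumptions, not the code that ships in the extractor: `drop_covered_spans` is a hypothetical name, and `Span` is re-declared here with inclusive bounds so the block is self-contained.

```rust
/// Inclusive byte range of an extracted candidate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical helper: drops every span that is fully covered by another span.
fn drop_covered_spans(mut spans: Vec<Span>) -> Vec<Span> {
    // Sort by start ascending, then by end descending, so that a covering span
    // always comes before the spans it covers.
    spans.sort_by(|a, b| a.start.cmp(&b.start).then(b.end.cmp(&a.end)));

    let mut kept: Vec<Span> = Vec::new();
    for span in spans {
        // Every kept span starts at or before `span`, and the last kept span
        // has the largest end, so `span` is covered exactly when the last
        // kept span ends at or after it.
        match kept.last() {
            Some(last) if last.end >= span.end => {}
            _ => kept.push(span),
        }
    }
    kept
}
```

Given the two spans `{ start: 17, end: 25 }` (`color:red`) and `{ start: 16, end: 26 }` (`[color:red]`), only the outer span `{ start: 16, end: 26 }` is kept. Sorting first keeps the pass linear after an `O(n log n)` sort; comparing every pair of spans would also work but scales quadratically.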
This approach was slow, and still a bit hard to reason about.

#### Matching on tuples

Matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about, but it was not very fast. Unfortunately, we had to abandon this approach in favor of a more optimized one.

In a perfect world, we would still write it this way, but have some compile-time macro that would optimize this for us.

#### Matching against `b'…'` instead of classification and jump tables

Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster.

Luckily for us, each machine has its own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine.

[^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme.

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-05 11:55:24 +01:00
version = "1.0.75"
source = "registry+https://github.com/rust-lang/crates.io-index"
Improve Oxide candidate extractor [0] (#16306) This PR adds a new candidate[^candidate] extractor with 2 major goals in mind: 1. It must be way easier to reason about and maintain. 2. It must have on-par performance or better than the current candidate extractor. ### Problem Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good. One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that. The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is. In a perfect world we limit the characters that can be used and defined a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions). The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in). While there is definitely some structure, we essentially work in 2 phases: 1. Try to extract `0..n` candidates. (This is the hard part) 2. 
Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have) Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability. Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not. (we have some ideas to limit the amount of false positives but that's for another time) ### Solution Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this: ```html <div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div> ``` <details> <summary>This will be parsed into the following AST:</summary> ```json [ { "kind": "functional", "root": "text", "value": { "kind": "named", "value": "red-500", "fraction": null }, "modifier": { "kind": "arbitrary", "value": "var(--my-opacity)" }, "variants": [ { "kind": "static", "root": "hover" }, { "kind": "functional", "root": "data", "value": { "kind": "arbitrary", "value": "state=pending" }, "modifier": null }, { "kind": "arbitrary", "selector": "@media(pointer:fine)", "relative": false } ], "important": false, "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)" } ] ``` </details> We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ... 
Some of these patterns will be important for the new candidate extractor as well: | Name | Example | Description | | -------------------------- | ----------------- | --------------------------------------------------------------------------------------------------- | | Static utility (named) | `flex` | A simple utility with no inputs whatsoever | | Functional utility (named) | `bg-red-500` | A utility `bg` with an input that is named `red-500` | | Arbitrary value | `bg-[#0088cc]` | A utility `bg` with an input that is arbitrary, denoted by `[…]` | | Arbitrary variable | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` | | Arbitrary property | `[color:red]` | A utility that sets a property to a value on the fly | A similar structure exist for modifiers, where each modifier must start with `/`: | Name | Example | Description | | ------------------ | --------------------------- | ---------------------------------------- | | Named modifier | bg-red-500`/20` | A named modifier | | Arbitrary value | bg-red-500`/[20%]` | An arbitrary value, denoted by `/[…]` | | Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` | Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`. 
| Name | Example | Description | | ------------------ | --------------------------- | ------------------------------------------------------------------------ | | Named variant | `hover:` | A named variant | | Arbitrary value | `data-[state=pending]:` | An arbitrary value, denoted by `[…]` | | Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)` | | Arbitrary variant | `[@media(pointer:fine)]:` | Similar to arbitrary properties, this will generate a variant on the fly | The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress). This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines. One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early. At the time of writing this, there are a bunch of machines: <details> <summary>Overview of the machines</summary> - `ArbitraryPropertyMachine` Extracts candidates such as `[color:red]`. Some of the rules are: 1. There must be a property name 2. There must be a `:` 3. There must ba a value There cannot be any spaces, the brackets are included, if the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`). ``` [color:red] ^^^^^^^^^^^ [--my-color:red] ^^^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `ArbitraryValueMachine` Extracts arbitrary values for utilities and modifiers including the brackets: ``` bg-[#0088cc] ^^^^^^^^^ bg-red-500/[20%] ^^^^^ ``` Depends on the `StringMachine`. - `ArbitraryVariableMachine` Extracts arbitrary variables including the parentheses. 
The first argument must be a valid CSS variable, the other arguments are optional fallback arguments. ``` (--my-value) ^^^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `CandidateMachine` Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility. ``` hover:focus:flex ^^^^^^^^^^^^^^^^ aria-invalid:bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `VariantMachine` and `UtilityMachine`. - `CssVariableMachine` Extracts CSS variables, they must start with `--` and must contain at least one alphanumeric character or, `-`, `_` and can contain any escaped character (except for whitespace). ``` bg-(--my-color) ^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^ bg-(--my-color)/(--my-opacity) ^^^^^^^^^^ ^^^^^^^^^^^^ ``` - `ModifierMachine` Extracts modifiers including the `/` - `/[` will delegate to the `ArbitraryValueMachine` - `/(` will delegate to the `ArbitraryVariableMachine` ``` bg-red-500/20 ^^^ bg-red-500/[20%] ^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedUtilityMachine` Extracts named utilities regardless of whether they are functional or static. ``` flex ^^^^ px-2.5 ^^^^^^ ``` This includes rules like: A `.` must be surrounded by digits. Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedVariantMachine` Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about. Another rule is that the `:` must be included. 
``` hover:flex ^^^^^^ data-[state=pending]:flex ^^^^^^^^^^^^^^^^^^^^^ supports-(--my-variable):flex ^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`. - `StringMachine` This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks. We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines. ``` content-["Hello_World!"] ^^^^^^^^^^^^^^ bg-[url("https://example.com")] ^^^^^^^^^^^^^^^^^^^^^ ``` - `UtilityMachine` Extracts utilities, it will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility. It will also handle important markers (including the legacy important marker). ``` flex ^^^^ bg-red-500/20 ^^^^^^^^^^^^^ !bg-red-500/20 Legacy important marker ^^^^^^^^^^^^^^ bg-red-500/20! New important marker ^^^^^^^^^^^^^^ !bg-red-500/20! Both, but this is considered invalid ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`. - `VariantMachine` Extracts variants, it will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant. ``` hover:focus:flex ^^^^^^ ^^^^^^ ``` Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`. </details> One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method and returns a `MachineState`. The `MachineState` looks like this: ```rs enum MachineState { Idle, Done(Span) } ``` Where a `Span` is just the location in the input where the candidate was found. 
```rs struct Span { pub start: usize, pub end: usize, } ``` #### Complexities **Boundary characters:** When running these machines to completion, they don't typically check for boundary characters, the wrapping `CandidateMachine` will check for boundary characters. A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate. ```html <div class="flex"></div> <!-- ^ ^ --> ``` The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate. **What to pick?** Let's imagine you are parsing this input: ```html <div class="hover:flex"></div> ``` The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this: ```rs let variant_machine_state = variant_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 17 }) // `hover:` let utility_machine_state = utility_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 16 }) // `hover` ``` They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility. Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.: ```tsx <div class={clsx({ underline: someCondition(), })} ></div> ``` In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid, is because there wasn't a variant present prior to this candidate. 
E.g.: ```tsx <div class={clsx({ hover:underline: someCondition(), })} ></div> ``` This will be considered invalid, if you do want this, you should use quotes. E.g.: ```tsx <div class={clsx({ 'hover:underline': someCondition(), })} ></div> ``` **Overlapping/covered spans:** Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example: ```csharp public enum StackSpacing { [CssClass("gap-y-4")] Small, [CssClass("gap-y-6")] Medium, [CssClass("gap-y-8")] Large } ``` In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here: 1. It is an arbitrary property, e.g.: `[color:red]` 2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:` When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state. - This is because it is not a valid variant because it didn't end with a `:`. - It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value. Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines perspective. Obviously we want to extract the `gap-y-*` classes here. To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing. That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class. The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`), then the `VariantMachine` would complete, but the `UtilityMachine` will not because not exists after it. We will apply the same idea in this case. 
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility). In this case we will make sure to drop spans that are covered by other spans. The extracted `Span`s will be valid candidates therefore if the outer most candidate is valid, we can throw away the inner candidate. ``` Position: 11112222222 67890123456 ↓↓↓↓↓↓↓↓↓↓↓ Span { start: 17, end: 25 } // color:red Span { start: 16, end: 26 } // [color:red] ``` #### Exceptions **JavaScript keys as candidates:** We already talked about the `clsx` scenario, but there are a few more exceptions and that has to do with different syntaxes. **CSS class shorthand in certain templating languages:** In Pug and Slim, you can have a syntax like this: ```pug .flex.underline div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="flex underline"> <div>Hello World</div> </div> ``` </details> We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input smaller or longer otherwise the positions might be off. In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that. If you want to use `px-2.5` with this syntax, then you'd write: ```pug .flex.px-2.5 div Hello World ``` But that's invalid because that technically means `flex`, `px-2`, and `5` as classes. 
You can use this syntax to get around that: ```pug div(class="px-2.5") div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="px-2.5"> <div>Hello World</div> </div> ``` </details> Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that) and ignore replacing `.` inside of strings. **Ruby's weird string syntax:** ```ruby %w[flex underline] ``` This is valid syntax and is shorthand for: ```ruby ["flex", "underline"] ``` Luckily this problem is solved by the running the sub-machines after each `[` character. ### Performance **Testing:** Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command: ```sh cargo test test_variant_machine_performance --release -- --ignored ``` This will run the test in release mode and allows you to run the ignored test. > [!CAUTION] > This test **_will_** fail, but it will print some output. E.g.: ``` tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns ``` **Readability:** One thing to note when looking at the code is that it's not always written in the cleanest way but we had to make some sacrifices for performance reasons. The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`. A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit. 
This can be written as: ```rs match (cursor.prev, cursor.curr, cursor.next) { (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ } _ => { /* … */ } } ``` But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`. To solve this we use some nesting, once we reach `b'.'` only then will we check for the previous and next characters. We will also early return in most places. If the previous character is not a digit, there is no need to check the next character. **Classification and jump tables:** Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes. E.g.: ```rs #[derive(Debug, Clone, Copy, PartialEq)] enum Class { /// ', ", or ` Quote, /// \ Escape, /// Whitespace characters Whitespace, Other, } const CLASS_TABLE: [Class; 256] = { let mut table = [Class::Other; 256]; macro_rules! set { ($class:expr, $($byte:expr),+ $(,)?) => { $(table[$byte as usize] = $class;)+ }; } set!(Class::Quote, b'"', b'\'', b'`'); set!(Class::Escape, b'\\'); set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C'); table }; ``` There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values. **Inlining**: Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding a `#[inline(always)]` improves performance, sometimes it doesn't improve it at all. You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added. ### Test Plan 1. 
Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates. So keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outer most `extractor/mod.rs`. 2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates. 3. You can test each machine's performance if you want to. There is a chance that this new parser is missing candidates even though a lot of tests are added and existing tests have been ported. To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious. #### Tailwind UI On Tailwind UI the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index d83b0a506..b3dd94a1d 100644 --- a/./main.css +++ b/./pr.css @@ -5576,9 +5576,6 @@ @layer utilities { --tw-saturate: saturate(0%); filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } - .\!filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important; - } .filter { filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } ``` </details> The reason `!filter` is gone, is because it was used like this: ```js getProducts.js 23: if (!filter) return true ``` And right now `(` and `)` are not considered valid boundary characters for a candidate. 
#### Catalyst On Catalyst, the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index 9f8ed129..4aec992e 100644 --- a/./main.css +++ b/./pr.css @@ -2105,9 +2105,6 @@ .outline-transparent { outline-color: transparent; } - .filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); - } .backdrop-blur-\[6px\] { --tw-backdrop-blur: blur(6px); -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,); @@ -7141,46 +7138,6 @@ inherits: false; initial-value: solid; } -@property --tw-blur { - syntax: "*"; - inherits: false; -} -@property --tw-brightness { - syntax: "*"; - inherits: false; -} -@property --tw-contrast { - syntax: "*"; - inherits: false; -} -@property --tw-grayscale { - syntax: "*"; - inherits: false; -} -@property --tw-hue-rotate { - syntax: "*"; - inherits: false; -} -@property --tw-invert { - syntax: "*"; - inherits: false; -} -@property --tw-opacity { - syntax: "*"; - inherits: false; -} -@property --tw-saturate { - syntax: "*"; - inherits: false; -} -@property --tw-sepia { - syntax: "*"; - inherits: false; -} -@property --tw-drop-shadow { - syntax: "*"; - inherits: false; -} @property --tw-backdrop-blur { syntax: "*"; inherits: false; ``` </details> The reason for this is that `filter` was only used as a function call: ```tsx src/app/docs/Code.tsx 31: .filter((x) => x !== null) ``` This was tested on all templates and they all remove a very small amount of classes that aren't used. 
The script to test this looks like this: ```sh bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css git diff --no-index --patch ./{main,pr}.css ``` This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder. --- ### Fixes: - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property) --- ### Ideas for in the future 1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object. 2. If you take a look at the AST, you'll notice that utilities and variants have a "root", these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class. 3. Passthrough the `prefix` information. Everything that doesn't start with `tw:` can be skipped. ### Design decisions that didn't make it Once you reach this part, you can stop reading if you want to, but this is more like a brain dump of the things we tried and didn't work out. Wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are. 
#### One character at a time In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to: 1. Ask the `CandidateMachine` which state it was in. 2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character. 3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on. 4. Once we did all of that we could go to the next character. In this approach, the `MachineState` looked like this instead: ```rs enum MachineState { Idle, Parsing, Done(Span) } ``` This had its own set of problems because now it's very hard to know whether we are done or not. ```html <div class="hover:flex"></div> <!-- ^ --> ``` Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not. #### `Span` stitching Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`. The next thing we had to do was to make sure that: 1. Covered spans were removed. We still do this part in the current implementation. 2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`). 3. For every combined variant span, find a corresponding utility span. - If there is no utility span, the candidate is invalid. - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans) - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block` 4. All left-over utility spans are candidates without variants. 
This approach was slow, and still a bit hard to reason about.

#### Matching on tuples

While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about, it was not very fast. Unfortunately, we had to abandon this approach in favor of a more optimized one. In a perfect world, we would still write it this way, but have some compile-time macro that would optimize this for us.

#### Matching against `b'…'` instead of classification and jump tables

Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster. Luckily for us, each machine has its own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine.

[^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme.

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-05 11:55:24 +01:00
checksum = "1639aaa9eeb76e91c6ae66da8ce3e89e921cd3885e99ec85f4abacae72fc91bf"
dependencies = [
"convert_case",
"once_cell",
"proc-macro2",
"quote",
"regex",
"semver",
"syn",
]
[[package]]
name = "napi-sys"
version = "2.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "427802e8ec3a734331fec1035594a210ce1ff4dc5bc1950530920ab717964ea3"
dependencies = [
"libloading",
]
Auto source detection improvements (#14820)

This PR introduces a new `source(…)` argument and improves on the existing `@source`. The goal of this PR is to make the automatic source detection configurable. Let's dig in.

By default, we will perform automatic source detection starting at the current working directory. Auto source detection will find plain text files (no binaries, images, ...) and will ignore git-ignored files.

If you want to start from a different directory, you can add the new `source(…)` argument to the import: `@import "tailwindcss/utilities" layer(utilities) source(…)`.

E.g.:

```css
/* ./src/styles/index.css */
@import 'tailwindcss/utilities' layer(utilities) source('../../');
```

Most people won't split their source files, and will just use the simple `@import "tailwindcss";`. For this reason, you can use `source(…)` on that import as well.

E.g.:

```css
/* ./src/styles/index.css */
@import 'tailwindcss' source('../../');
```

Sometimes, you want to rely on auto source detection, but also want to look in another directory for source files. In this case, you can use the `@source` directive:

```css
/* ./src/index.css */
@import 'tailwindcss';

/* Look for `blade.php` files in `../resources/views` */
@source '../resources/views/**/*.blade.php';
```

However, you don't need to specify the extension. Instead, you can just point to the directory, and the same automatic source detection rules will apply.

```css
/* ./src/index.css */
@import 'tailwindcss';

@source '../resources/views';
```

If, for whatever reason, you want to disable the default source detection feature entirely, and only want to rely on very specific glob patterns you define, then you can disable it via `source(none)`.

```css
/* Completely disable the default auto source detection */
@import 'tailwindcss' source(none);

/* Only look at .blade.php files, nothing else */
@source "../resources/views/**/*.blade.php";
```

Note: even with `source(none)`, if your `@source` points to a directory, then auto source detection will still be performed in that directory. If you don't want that, then you can simply add explicit files in the globs as seen in the previous example.

```css
/* Completely disable the default auto source detection */
@import 'tailwindcss' source(none);

/* Run auto source detection in `../resources/views` */
@source "../resources/views";
```

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Adam Wathan <4323180+adamwathan@users.noreply.github.com>
2024-10-29 21:33:34 +01:00
[[package]]
name = "nom"
version = "7.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d273983c5a657a70a3e8f2a01329822f3b8c8172b73826411a55751e404a0a4a"
dependencies = [
"memchr",
"minimal-lexical",
]
[[package]]
name = "nu-ansi-term"
version = "0.46.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77a8165726e8236064dbb45459242600304b42a5ea24ee2948e18e023bf7ba84"
dependencies = [
"overload",
"winapi",
]
[[package]]
name = "once_cell"
Add `@source` support (#14078)

This PR is an umbrella PR where we will add support for the new `@source` directive. This will allow you to add explicit content glob patterns if you want to look for Tailwind classes in other files that are not automatically detected yet.

Right now this is an addition to the existing auto content detection that is automatically enabled in the `@tailwindcss/postcss` and `@tailwindcss/cli` packages. The `@tailwindcss/vite` package doesn't use the auto content detection, but uses the module graph instead.

From an API perspective there is not a lot going on. There are only a few things that you have to know when using the `@source` directive, and you probably already know the rules:

1. You can use multiple `@source` directives if you want.
2. The `@source` directive accepts a glob pattern so that you can match multiple files at once.
3. The pattern is relative to the current file you are in.
4. The pattern includes all files it is matching, even git-ignored files.
   1. The motivation for this is so that you can explicitly point to a `node_modules` folder if you want to look at `node_modules` for whatever reason.
5. Right now we don't support negative globs (starting with a `!`) yet; that will be available in the near future.

Usage example:

```css
/* ./src/input.css */
@import "tailwindcss";

@source "../laravel/resources/views/**/*.blade.php";
@source "../../packages/monorepo-package/**/*.js";
```

It looks like the PR introduced a lot of changes, but this is a side effect of all the other plumbing work we had to do to make this work. For example:

1. We added dedicated integration tests that run on Linux and Windows in CI (just to make sure that all the `path` logic is correct).
2. We have to make sure that the glob patterns are always correct even if you are using `@import` in your CSS and use `@source` in an imported file. This is because we receive the flattened CSS contents where all `@import`s are inlined.
3. We have to make sure that we also listen for changes in the files that match any of these patterns and trigger a rebuild.

PRs:

- [x] https://github.com/tailwindlabs/tailwindcss/pull/14063
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14085
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14079
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14067
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14076
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14080
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14127
- [x] https://github.com/tailwindlabs/tailwindcss/pull/14135

Once all the PRs are merged, then this umbrella PR can be merged.

> [!IMPORTANT]
> Make sure to merge this without rebasing such that each individual PR ends up on the main branch.

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>
Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Adam Wathan <adam.wathan@gmail.com>
2024-08-07 16:38:44 +02:00
version = "1.19.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92"
[[package]]
name = "overload"
version = "0.1.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b15813163c1d831bf4a13c3610c05c0d03b39feb07f7e09fa234dac9b15aaf39"
[[package]]
name = "pin-project-lite"
version = "0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e0a7ae3ac2f1173085d398531c705756c94a4c56843785df85a60c1a0afac116"
[[package]]
name = "pretty_assertions"
version = "1.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3ae130e2f271fbc2ac3a40fb1d07180839cdbbe443c7a27e1e3c13c5cac0116d"
dependencies = [
"diff",
"yansi",
]
[[package]]
name = "proc-macro2"
version = "1.0.86"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5e719e8df665df0d1c8fbfd238015744736151d4445ec0836b8e628aae103b77"
dependencies = [
"unicode-ident",
]
[[package]]
name = "quote"
version = "1.0.28"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b9ab9c7eadfd8df19006f1cf1a4aed13540ed5cbc047010ece5826e10825488"
dependencies = [
"proc-macro2",
]
[[package]]
name = "rayon"
version = "1.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b418a60154510ca1a002a752ca9714984e21e4241e804d32555251faf8b78ffa"
dependencies = [
"either",
"rayon-core",
]
[[package]]
name = "rayon-core"
version = "1.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1465873a3dfdaa8ae7cb14b4383657caab0b3e8a0aa9ae8e04b044854c8dfce2"
dependencies = [
"crossbeam-deque",
"crossbeam-utils",
]
[[package]]
name = "regex"
version = "1.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b544ef1b4eac5dc2db33ea63606ae9ffcfac26c1416a2806ae0bf5f56b201191"
dependencies = [
"aho-corasick",
"memchr",
"regex-automata 0.4.8",
"regex-syntax 0.8.5",
]
[[package]]
name = "regex-automata"
version = "0.1.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6c230d73fb8d8c1b9c0b3135c5142a8acee3a0558fb8db5cf1cb65f8d7862132"
dependencies = [
"regex-syntax 0.6.29",
]
[[package]]
name = "regex-automata"
version = "0.4.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "368758f23274712b504848e9d5a6f010445cc8b87a7cdb4d7cbee666c1288da3"
dependencies = [
"aho-corasick",
"memchr",
"regex-syntax 0.8.5",
]
[[package]]
name = "regex-syntax"
version = "0.6.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f162c6dd7b008981e4d40210aca20b4bd0f9b60ca9271061b07f78537722f2e1"
[[package]]
name = "regex-syntax"
version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2b15c43186be67a4fd63bee50d0303afffcef381492ebe2c5d87f324e1b8815c"
2024-09-26 23:50:31 +02:00
[[package]]
name = "rustc-hash"
Improve Oxide candidate extractor [0] (#16306) This PR adds a new candidate[^candidate] extractor with 2 major goals in mind: 1. It must be way easier to reason about and maintain. 2. It must have on-par performance or better than the current candidate extractor. ### Problem Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good. One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that. The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is. In a perfect world we limit the characters that can be used and defined a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions). The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in). While there is definitely some structure, we essentially work in 2 phases: 1. Try to extract `0..n` candidates. (This is the hard part) 2. 
Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have) Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability. Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not. (we have some ideas to limit the amount of false positives but that's for another time) ### Solution Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this: ```html <div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div> ``` <details> <summary>This will be parsed into the following AST:</summary> ```json [ { "kind": "functional", "root": "text", "value": { "kind": "named", "value": "red-500", "fraction": null }, "modifier": { "kind": "arbitrary", "value": "var(--my-opacity)" }, "variants": [ { "kind": "static", "root": "hover" }, { "kind": "functional", "root": "data", "value": { "kind": "arbitrary", "value": "state=pending" }, "modifier": null }, { "kind": "arbitrary", "selector": "@media(pointer:fine)", "relative": false } ], "important": false, "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)" } ] ``` </details> We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ... 
Some of these patterns will be important for the new candidate extractor as well:

| Name                       | Example           | Description                                                                                         |
| -------------------------- | ----------------- | --------------------------------------------------------------------------------------------------- |
| Static utility (named)     | `flex`            | A simple utility with no inputs whatsoever                                                          |
| Functional utility (named) | `bg-red-500`      | A utility `bg` with an input that is named `red-500`                                                |
| Arbitrary value            | `bg-[#0088cc]`    | A utility `bg` with an input that is arbitrary, denoted by `[…]`                                    |
| Arbitrary variable         | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` |
| Arbitrary property         | `[color:red]`     | A utility that sets a property to a value on the fly                                                |

A similar structure exists for modifiers, where each modifier must start with `/`:

| Name               | Example                     | Description                              |
| ------------------ | --------------------------- | ---------------------------------------- |
| Named modifier     | bg-red-500`/20`             | A named modifier                         |
| Arbitrary value    | bg-red-500`/[20%]`          | An arbitrary value, denoted by `/[…]`    |
| Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` |

Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`.

| Name               | Example                     | Description                                                              |
| ------------------ | --------------------------- | ------------------------------------------------------------------------ |
| Named variant      | `hover:`                    | A named variant                                                          |
| Arbitrary value    | `data-[state=pending]:`     | An arbitrary value, denoted by `[…]`                                     |
| Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)`                                  |
| Arbitrary variant  | `[@media(pointer:fine)]:`   | Similar to arbitrary properties, this will generate a variant on the fly |

The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress).

This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines.

One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early.

At the time of writing this, there are a bunch of machines:

<details>
<summary>Overview of the machines</summary>

- `ArbitraryPropertyMachine`

  Extracts candidates such as `[color:red]`. Some of the rules are:

  1. There must be a property name
  2. There must be a `:`
  3. There must be a value

  There cannot be any spaces, the brackets are included, and if the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`).

  ```
  [color:red]
  ^^^^^^^^^^^

  [--my-color:red]
  ^^^^^^^^^^^^^^^^
  ```

  Depends on the `StringMachine` and `CssVariableMachine`.

- `ArbitraryValueMachine`

  Extracts arbitrary values for utilities and modifiers including the brackets:

  ```
  bg-[#0088cc]
     ^^^^^^^^^

  bg-red-500/[20%]
             ^^^^^
  ```

  Depends on the `StringMachine`.

- `ArbitraryVariableMachine`

  Extracts arbitrary variables including the parentheses.
The first argument must be a valid CSS variable, the other arguments are optional fallback arguments. ``` (--my-value) ^^^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `CandidateMachine` Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility. ``` hover:focus:flex ^^^^^^^^^^^^^^^^ aria-invalid:bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `VariantMachine` and `UtilityMachine`. - `CssVariableMachine` Extracts CSS variables, they must start with `--` and must contain at least one alphanumeric character or, `-`, `_` and can contain any escaped character (except for whitespace). ``` bg-(--my-color) ^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^ bg-(--my-color)/(--my-opacity) ^^^^^^^^^^ ^^^^^^^^^^^^ ``` - `ModifierMachine` Extracts modifiers including the `/` - `/[` will delegate to the `ArbitraryValueMachine` - `/(` will delegate to the `ArbitraryVariableMachine` ``` bg-red-500/20 ^^^ bg-red-500/[20%] ^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedUtilityMachine` Extracts named utilities regardless of whether they are functional or static. ``` flex ^^^^ px-2.5 ^^^^^^ ``` This includes rules like: A `.` must be surrounded by digits. Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedVariantMachine` Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about. Another rule is that the `:` must be included. 
``` hover:flex ^^^^^^ data-[state=pending]:flex ^^^^^^^^^^^^^^^^^^^^^ supports-(--my-variable):flex ^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`. - `StringMachine` This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks. We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines. ``` content-["Hello_World!"] ^^^^^^^^^^^^^^ bg-[url("https://example.com")] ^^^^^^^^^^^^^^^^^^^^^ ``` - `UtilityMachine` Extracts utilities, it will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility. It will also handle important markers (including the legacy important marker). ``` flex ^^^^ bg-red-500/20 ^^^^^^^^^^^^^ !bg-red-500/20 Legacy important marker ^^^^^^^^^^^^^^ bg-red-500/20! New important marker ^^^^^^^^^^^^^^ !bg-red-500/20! Both, but this is considered invalid ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`. - `VariantMachine` Extracts variants, it will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant. ``` hover:focus:flex ^^^^^^ ^^^^^^ ``` Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`. </details> One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method and returns a `MachineState`. The `MachineState` looks like this: ```rs enum MachineState { Idle, Done(Span) } ``` Where a `Span` is just the location in the input where the candidate was found. 
```rs struct Span { pub start: usize, pub end: usize, } ``` #### Complexities **Boundary characters:** When running these machines to completion, they don't typically check for boundary characters, the wrapping `CandidateMachine` will check for boundary characters. A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate. ```html <div class="flex"></div> <!-- ^ ^ --> ``` The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate. **What to pick?** Let's imagine you are parsing this input: ```html <div class="hover:flex"></div> ``` The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this: ```rs let variant_machine_state = variant_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 17 }) // `hover:` let utility_machine_state = utility_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 16 }) // `hover` ``` They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility. Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.: ```tsx <div class={clsx({ underline: someCondition(), })} ></div> ``` In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid, is because there wasn't a variant present prior to this candidate. 
E.g.:

```tsx
<div
  class={clsx({
    hover:underline: someCondition(),
  })}
></div>
```

This will be considered invalid. If you do want this, you should use quotes.

E.g.:

```tsx
<div
  class={clsx({
    'hover:underline': someCondition(),
  })}
></div>
```

**Overlapping/covered spans:**

Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example:

```csharp
public enum StackSpacing
{
  [CssClass("gap-y-4")]
  Small,

  [CssClass("gap-y-6")]
  Medium,

  [CssClass("gap-y-8")]
  Large
}
```

In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here:

1. It is an arbitrary property, e.g.: `[color:red]`
2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:`

When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state.

- It is not a valid variant because it didn't end with a `:`.
- It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value.

Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines' perspective.

Obviously we want to extract the `gap-y-*` classes here.

To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing.

That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, we will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class.

The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`)? Then the `VariantMachine` would complete, but the `UtilityMachine` would not because no utility exists after it. We will apply the same idea in this case.
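As a rough illustration of the re-scan described above, the sketch below only computes where a new extraction pass would start inside the slice. The function name `restart_offsets` and its signature are hypothetical and not taken from the actual extractor; running the machines from each offset is left out.

```rust
/// Hypothetical helper (not part of the real extractor): returns the absolute
/// positions right after every `[` found in `input[start..end]`. Each position
/// is where a fresh extraction pass would begin.
fn restart_offsets(input: &[u8], start: usize, end: usize) -> Vec<usize> {
    let mut offsets = Vec::new();

    for (i, &byte) in input[start..end].iter().enumerate() {
        if byte == b'[' {
            // `i` is relative to the slice, so shift it back to an absolute
            // position and move one past the `[` itself.
            offsets.push(start + i + 1);
        }
    }

    offsets
}
```

For `[CssClass("gap-y-6")]` this yields a single restart position directly after the opening bracket. The offsets stay absolute, so any span found by a later pass would still point into the original input.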
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility).

In this case we will make sure to drop spans that are covered by other spans.

The extracted `Span`s will be valid candidates, therefore if the outermost candidate is valid, we can throw away the inner candidate.

```
Position:   11112222222
            67890123456
            ↓↓↓↓↓↓↓↓↓↓↓
Span { start: 17, end: 25 } // color:red
Span { start: 16, end: 26 } // [color:red]
```

#### Exceptions

**JavaScript keys as candidates:**

We already talked about the `clsx` scenario, but there are a few more exceptions, and they have to do with different syntaxes.

**CSS class shorthand in certain templating languages:**

In Pug and Slim, you can have a syntax like this:

```pug
.flex.underline
  div Hello World
```

<details>
<summary>Generated HTML</summary>

```html
<div class="flex underline">
  <div>Hello World</div>
</div>
```

</details>

We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input shorter or longer, otherwise the positions might be off.

In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that.

If you want to use `px-2.5` with this syntax, then you'd write:

```pug
.flex.px-2.5
  div Hello World
```

But that's invalid because that technically means `flex`, `px-2`, and `5` as classes.
You can use this syntax to get around that:

```pug
div(class="px-2.5")
  div Hello World
```

<details>
<summary>Generated HTML</summary>

```html
<div class="px-2.5">
  <div>Hello World</div>
</div>
```

</details>

Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that), and we skip replacing `.` inside of strings.

**Ruby's weird string syntax:**

```ruby
%w[flex underline]
```

This is valid syntax and is shorthand for:

```ruby
["flex", "underline"]
```

Luckily this problem is solved by running the sub-machines after each `[` character.

### Performance

**Testing:**

Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command:

```sh
cargo test test_variant_machine_performance --release -- --ignored
```

This will run the test in release mode and allows you to run the ignored test.

> [!CAUTION]
> This test **_will_** fail, but it will print some output. E.g.:

```
tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s
tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns
```

**Readability:**

One thing to note when looking at the code is that it's not always written in the cleanest way, because we had to make some sacrifices for performance reasons.

The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`.

A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit.
This can be written as:

```rs
match (cursor.prev, cursor.curr, cursor.next) {
  (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ }
  _ => { /* … */ }
}
```

But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`.

To solve this we use some nesting: only once we reach `b'.'` do we check the previous and next characters. We also return early in most places. If the previous character is not a digit, there is no need to check the next character.

**Classification and jump tables:**

Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes.

E.g.:

```rs
#[derive(Debug, Clone, Copy, PartialEq)]
enum Class {
    /// ', ", or `
    Quote,

    /// \
    Escape,

    /// Whitespace characters
    Whitespace,

    Other,
}

const CLASS_TABLE: [Class; 256] = {
    let mut table = [Class::Other; 256];

    macro_rules! set {
        ($class:expr, $($byte:expr),+ $(,)?) => {
            $(table[$byte as usize] = $class;)+
        };
    }

    set!(Class::Quote, b'"', b'\'', b'`');
    set!(Class::Escape, b'\\');
    set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C');

    table
};
```

There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values.

**Inlining**:

Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding `#[inline(always)]` improves performance; in others it doesn't improve it at all.

You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added.

### Test Plan

1. Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates, so keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outermost `extractor/mod.rs`.
2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates.
3. You can test each machine's performance if you want to.

There is a chance that this new parser is missing candidates even though a lot of tests were added and existing tests have been ported. To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious.

#### Tailwind UI

On Tailwind UI the diff looks like this:

<details>
<summary>diff</summary>

```diff
diff --git a/./main.css b/./pr.css
index d83b0a506..b3dd94a1d 100644
--- a/./main.css
+++ b/./pr.css
@@ -5576,9 +5576,6 @@ @layer utilities {
     --tw-saturate: saturate(0%);
     filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,);
   }
-  .\!filter {
-    filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important;
-  }
   .filter {
     filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,);
   }
```

</details>

The reason `!filter` is gone is that it was used like this:

```js
getProducts.js
23:    if (!filter) return true
```

And right now `(` and `)` are not considered valid boundary characters for a candidate.
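To make the `!filter` case concrete, here is a minimal sketch of a boundary check. It rests on assumptions: the only boundary characters it knows are whitespace and quotes (the real extractor accepts more), the start and end of the input count as boundaries, and the names `is_boundary` and `has_valid_boundaries` are made up for this illustration. Spans use an inclusive `end`, like the spans shown earlier in this description.

```rust
/// Assumption for this sketch: only whitespace and quotes are boundaries.
/// The real set is larger, but `(` and `)` are not in it.
fn is_boundary(byte: u8) -> bool {
    matches!(byte, b' ' | b'\t' | b'\n' | b'\r' | b'"' | b'\'' | b'`')
}

/// Hypothetical check: a span `start..=end` is only accepted when the bytes
/// touching it on both sides are boundary characters. The edges of the input
/// are treated as boundaries as well.
fn has_valid_boundaries(input: &[u8], start: usize, end: usize) -> bool {
    // Byte directly before the span, if any.
    let before = start.checked_sub(1).map(|i| input[i]);
    // Byte directly after the span, if any.
    let after = input.get(end + 1).copied();

    before.map_or(true, is_boundary) && after.map_or(true, is_boundary)
}
```

With this check, `flex` in `<div class="flex"></div>` is accepted because it is wrapped in quotes, while `!filter` in `if (!filter) return true` is rejected because it is wrapped in parentheses.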
#### Catalyst

On Catalyst, the diff looks like this:

<details>
<summary>diff</summary>

```diff
diff --git a/./main.css b/./pr.css
index 9f8ed129..4aec992e 100644
--- a/./main.css
+++ b/./pr.css
@@ -2105,9 +2105,6 @@
   .outline-transparent {
     outline-color: transparent;
   }
-  .filter {
-    filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,);
-  }
   .backdrop-blur-\[6px\] {
     --tw-backdrop-blur: blur(6px);
     -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,);
@@ -7141,46 +7138,6 @@
   inherits: false;
   initial-value: solid;
 }
-@property --tw-blur {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-brightness {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-contrast {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-grayscale {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-hue-rotate {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-invert {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-opacity {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-saturate {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-sepia {
-  syntax: "*";
-  inherits: false;
-}
-@property --tw-drop-shadow {
-  syntax: "*";
-  inherits: false;
-}
 @property --tw-backdrop-blur {
   syntax: "*";
   inherits: false;
```

</details>

The reason for this is that `filter` was only used as a function call:

```tsx
src/app/docs/Code.tsx
31:    .filter((x) => x !== null)
```

This was tested on all templates, and in each of them only a very small number of unused classes are removed.
The script to test this looks like this:

```sh
bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css
bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css
git diff --no-index --patch ./{main,pr}.css
```

This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder.

---

### Fixes:

- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801
- Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property)

---

### Ideas for the future

1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object.
2. If you take a look at the AST, you'll notice that utilities and variants have a "root"; these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class.
3. Pass through the `prefix` information. Everything that doesn't start with `tw:` can be skipped.

### Design decisions that didn't make it

Once you reach this part, you can stop reading if you want to. This is more of a brain dump of the things we tried that didn't work out. We wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are.
#### One character at a time

In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to:

1. Ask the `CandidateMachine` which state it was in.
2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character.
3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on.
4. Once we did all of that we could go to the next character.

In this approach, the `MachineState` looked like this instead:

```rs
enum MachineState {
  Idle,
  Parsing,
  Done(Span)
}
```

This had its own set of problems because now it's very hard to know whether we are done or not.

```html
<div class="hover:flex"></div>
<!--             ^          -->
```

Let's look at the current position in the example above. At this point, it's both a valid variant and a valid utility, so there was a lot of additional state we had to track to know whether we were done or not.

#### `Span` stitching

Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`.

The next thing we had to do was to make sure that:

1. Covered spans were removed. We still do this part in the current implementation.
2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`).
3. For every combined variant span, find a corresponding utility span.
   - If there is no utility span, the candidate is invalid.
   - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans)
   - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block`
4. All left-over utility spans are candidates without variants.
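Step 1 of this list, dropping covered spans, is the part that survived into the current implementation. A minimal sketch of one way to do it is shown below; the function name `drop_covered_spans` and the sort-based strategy are assumptions for illustration, not the code from the PR. The `Span` struct mirrors the one shown earlier, with an inclusive `end`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical helper: removes every span that is fully covered by another
/// span. Partially overlapping spans are both kept.
fn drop_covered_spans(mut spans: Vec<Span>) -> Vec<Span> {
    // Sort by `start` ascending and, for equal starts, by `end` descending so
    // that the widest span at a given start comes first.
    spans.sort_by(|a, b| a.start.cmp(&b.start).then(b.end.cmp(&a.end)));

    let mut kept: Vec<Span> = Vec::new();
    for span in spans {
        // Every kept span starts at or before `span.start`, and the last kept
        // span has the largest `end`, so comparing against it is enough.
        let covered = kept.last().map_or(false, |outer| span.end <= outer.end);
        if !covered {
            kept.push(span);
        }
    }

    kept
}
```

Applied to the `[color:red]` example from the section on overlapping spans, the inner `color:red` span (17 to 25) is dropped and only the outer span (16 to 26) remains.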
This approach was slow, and still a bit hard to reason about.

#### Matching on tuples

While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about, it was not very fast. Unfortunately we had to abandon this approach in favor of a more optimized one.

In a perfect world, we would still write it this way, but have some compile time macro that would optimize this for us.

#### Matching against `b'…'` instead of classification and jump tables

Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster.

Luckily for us, each machine has its own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine.

[^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something, but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme.

---------

Co-authored-by: Jordan Pittman <jordan@cryptica.me>
Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-05 11:55:24 +01:00
version = "2.1.1"
2024-09-26 23:50:31 +02:00
source = "registry+https://github.com/rust-lang/crates.io-index"
Improve Oxide candidate extractor [0] (#16306) This PR adds a new candidate[^candidate] extractor with 2 major goals in mind: 1. It must be way easier to reason about and maintain. 2. It must have on-par performance or better than the current candidate extractor. ### Problem Candidate extraction is a bit of a wild west in Tailwind CSS and it's a very critical step to make sure that all your classes are picked up correctly to ensure that your website/app looks good. One issue we run into is that Tailwind CSS is used in many different "host" languages and frameworks with their own syntax. It's not only used in HTML but also in JSX/TSX, Vue, Svelte, Angular, Pug, Rust, PHP, Rails, Clojure, .NET, … the list goes on and all of these have different syntaxes. Introducing dedicated parsers for each of these languages would be a huge maintenance burden because there will be new languages and frameworks coming up all the time. The best thing we can do is make assumptions and so far we've done a pretty good job at that. The only certainty we have is that there is at least _some_ structure to the possible Tailwind classes used in a file. E.g.: `abc#def` is definitely not a valid class, `hover:flex` definitely is. In a perfect world we limit the characters that can be used and defined a formal grammar that each candidate must follow, but that's not really an option right now (maybe this is something we can implement in future major versions). The current candidate extractor we have has grown organically over time and required patching things here and there to make it work in various scenarios (and edge cases due to the different languages Tailwind is used in). While there is definitely some structure, we essentially work in 2 phases: 1. Try to extract `0..n` candidates. (This is the hard part) 2. 
Validate each candidate to make sure they are valid looking classes (by validating against the few rules we have) Another reason the current extractor is hard to reason about is that we need it to be fast and that comes with some trade-offs to readability and maintainability. Unfortunately there will always be a lot of false positives, but if we extract more classes than necessary then that's fine. It's only when we pass the candidates to the core engine that we will know for sure if they are valid or not. (we have some ideas to limit the amount of false positives but that's for another time) ### Solution Since the introduction of Tailwind CSS v4, we re-worked the internals quite a bit and we have a dedicated internal AST structure for candidates. For example, if you take a look at this: ```html <div class="[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)"></div> ``` <details> <summary>This will be parsed into the following AST:</summary> ```json [ { "kind": "functional", "root": "text", "value": { "kind": "named", "value": "red-500", "fraction": null }, "modifier": { "kind": "arbitrary", "value": "var(--my-opacity)" }, "variants": [ { "kind": "static", "root": "hover" }, { "kind": "functional", "root": "data", "value": { "kind": "arbitrary", "value": "state=pending" }, "modifier": null }, { "kind": "arbitrary", "selector": "@media(pointer:fine)", "relative": false } ], "important": false, "raw": "[@media(pointer:fine)]:data-[state=pending]:hover:text-red-500/(--my-opacity)" } ] ``` </details> We have a lot of information here and we gave these patterns a name internally. You'll see names like `functional`, `static`, `arbitrary`, `modifier`, `variant`, `compound`, ... 
Some of these patterns will be important for the new candidate extractor as well: | Name | Example | Description | | -------------------------- | ----------------- | --------------------------------------------------------------------------------------------------- | | Static utility (named) | `flex` | A simple utility with no inputs whatsoever | | Functional utility (named) | `bg-red-500` | A utility `bg` with an input that is named `red-500` | | Arbitrary value | `bg-[#0088cc]` | A utility `bg` with an input that is arbitrary, denoted by `[…]` | | Arbitrary variable | `bg-(--my-color)` | A utility `bg` with an input that is arbitrary and has a CSS variable shorthand, denoted by `(--…)` | | Arbitrary property | `[color:red]` | A utility that sets a property to a value on the fly | A similar structure exist for modifiers, where each modifier must start with `/`: | Name | Example | Description | | ------------------ | --------------------------- | ---------------------------------------- | | Named modifier | bg-red-500`/20` | A named modifier | | Arbitrary value | bg-red-500`/[20%]` | An arbitrary value, denoted by `/[…]` | | Arbitrary variable | bg-red-500`/(--my-opacity)` | An arbitrary variable, denoted by `/(…)` | Last but not least, we have variants. They have a very similar pattern but they _must_ end in a `:`. 
| Name | Example | Description | | ------------------ | --------------------------- | ------------------------------------------------------------------------ | | Named variant | `hover:` | A named variant | | Arbitrary value | `data-[state=pending]:` | An arbitrary value, denoted by `[…]` | | Arbitrary variable | `supports-(--my-variable):` | An arbitrary variable, denoted by `(…)` | | Arbitrary variant | `[@media(pointer:fine)]:` | Similar to arbitrary properties, this will generate a variant on the fly | The goal with the new extractor is to encode these separate patterns in dedicated pieces of code (we called them "machines" because they are mostly state machine based and because I've been watching Person of Interest but I digress). This will allow us to focus on each pattern separately, so if there is a bug or some new syntax we want to support we can add it to those machines. One nice benefit of this is that we can encode the rules and handle validation as we go. The moment we know that some pattern is invalid, we can bail out early. At the time of writing this, there are a bunch of machines: <details> <summary>Overview of the machines</summary> - `ArbitraryPropertyMachine` Extracts candidates such as `[color:red]`. Some of the rules are: 1. There must be a property name 2. There must be a `:` 3. There must ba a value There cannot be any spaces, the brackets are included, if the property is a CSS variable, it must be a valid CSS variable (uses the `CssVariableMachine`). ``` [color:red] ^^^^^^^^^^^ [--my-color:red] ^^^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `ArbitraryValueMachine` Extracts arbitrary values for utilities and modifiers including the brackets: ``` bg-[#0088cc] ^^^^^^^^^ bg-red-500/[20%] ^^^^^ ``` Depends on the `StringMachine`. - `ArbitraryVariableMachine` Extracts arbitrary variables including the parentheses. 
The first argument must be a valid CSS variable, the other arguments are optional fallback arguments. ``` (--my-value) ^^^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^ ``` Depends on the `StringMachine` and `CssVariableMachine`. - `CandidateMachine` Uses the variant machine and utility machine. It will make sure that 0 or more variants are directly touching and followed by a utility. ``` hover:focus:flex ^^^^^^^^^^^^^^^^ aria-invalid:bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `VariantMachine` and `UtilityMachine`. - `CssVariableMachine` Extracts CSS variables, they must start with `--` and must contain at least one alphanumeric character or, `-`, `_` and can contain any escaped character (except for whitespace). ``` bg-(--my-color) ^^^^^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^ bg-(--my-color)/(--my-opacity) ^^^^^^^^^^ ^^^^^^^^^^^^ ``` - `ModifierMachine` Extracts modifiers including the `/` - `/[` will delegate to the `ArbitraryValueMachine` - `/(` will delegate to the `ArbitraryVariableMachine` ``` bg-red-500/20 ^^^ bg-red-500/[20%] ^^^^^^ bg-red-500/(--my-opacity) ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedUtilityMachine` Extracts named utilities regardless of whether they are functional or static. ``` flex ^^^^ px-2.5 ^^^^^^ ``` This includes rules like: A `.` must be surrounded by digits. Depends on the `ArbitraryValueMachine` and `ArbitraryVariableMachine`. - `NamedVariantMachine` Extracts named variants regardless of whether they are functional or static. This is very similar to the `NamedUtilityMachine` but with different rules. We could combine them, but splitting things up makes it easier to reason about. Another rule is that the `:` must be included. 
``` hover:flex ^^^^^^ data-[state=pending]:flex ^^^^^^^^^^^^^^^^^^^^^ supports-(--my-variable):flex ^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryVariableMachine`, `ArbitraryValueMachine`, and `ModifierMachine`. - `StringMachine` This is a low-level machine that is used by various other machines. The only job this has is to extract strings that start with double quotes, single quotes or backticks. We have this because once you are in a string, we don't have to make sure that brackets, parens and curlies are properly balanced. We have to make sure that balancing brackets are properly handled in other machines. ``` content-["Hello_World!"] ^^^^^^^^^^^^^^ bg-[url("https://example.com")] ^^^^^^^^^^^^^^^^^^^^^ ``` - `UtilityMachine` Extracts utilities, it will use the lower level `NamedUtilityMachine`, `ArbitraryPropertyMachine` and `ModifierMachine` to extract the utility. It will also handle important markers (including the legacy important marker). ``` flex ^^^^ bg-red-500/20 ^^^^^^^^^^^^^ !bg-red-500/20 Legacy important marker ^^^^^^^^^^^^^^ bg-red-500/20! New important marker ^^^^^^^^^^^^^^ !bg-red-500/20! Both, but this is considered invalid ^^^^^^^^^^^^^^^ ``` Depends on the `ArbitraryPropertyMachine`, `NamedUtilityMachine`, and `ModifierMachine`. - `VariantMachine` Extracts variants, it will use the lower level `NamedVariantMachine` and `ArbitraryValueMachine` to extract the variant. ``` hover:focus:flex ^^^^^^ ^^^^^^ ``` Depends on the `NamedVariantMachine` and `ArbitraryValueMachine`. </details> One important thing to know here is that each machine runs to completion. They all implement a `Machine` trait that has a `next(cursor)` method and returns a `MachineState`. The `MachineState` looks like this: ```rs enum MachineState { Idle, Done(Span) } ``` Where a `Span` is just the location in the input where the candidate was found. 
```rs struct Span { pub start: usize, pub end: usize, } ``` #### Complexities **Boundary characters:** When running these machines to completion, they don't typically check for boundary characters, the wrapping `CandidateMachine` will check for boundary characters. A boundary character is where we know that even though the character is touching the candidate it will not be part of the candidate. ```html <div class="flex"></div> <!-- ^ ^ --> ``` The quotes are touching the candidate `flex`, but they will not be part of the candidate itself, so this is considered a valid candidate. **What to pick?** Let's imagine you are parsing this input: ```html <div class="hover:flex"></div> ``` The `UtilityMachine` will find `hover` and `flex`. The `VariantMachine` will find `hover:`. This means that at a certain point in the `CandidateMachine` you will see something like this: ```rs let variant_machine_state = variant_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 17 }) // `hover:` let utility_machine_state = utility_machine.next(cursor); // MachineState::Done(Span { start: 12, end: 16 }) // `hover` ``` They are both done, but which one do we pick? In this scenario we will always pick the variant because its range will always be 1 character longer than the utility. Of course there is an exception to this rule and it has to do with the fact that Tailwind CSS can be used in different languages and frameworks. A lot of people use `clsx` for dynamically applying classes to their React components. E.g.: ```tsx <div class={clsx({ underline: someCondition(), })} ></div> ``` In this scenario, we will see `underline:` as a variant, and `underline` as a utility. We will pick the utility in this scenario because the next character is whitespace so this will never be a valid candidate otherwise (variants and utilities must be touching). Another reason this is valid, is because there wasn't a variant present prior to this candidate. 
E.g.: ```tsx <div class={clsx({ hover:underline: someCondition(), })} ></div> ``` This will be considered invalid, if you do want this, you should use quotes. E.g.: ```tsx <div class={clsx({ 'hover:underline': someCondition(), })} ></div> ``` **Overlapping/covered spans:** Another complexity is that the extracted spans for candidates can and will overlap. Let's take a look at this C# example: ```csharp public enum StackSpacing { [CssClass("gap-y-4")] Small, [CssClass("gap-y-6")] Medium, [CssClass("gap-y-8")] Large } ``` In this scenario, `[CssClass("gap-y-4")]` starts with a `[` so we have a few options here: 1. It is an arbitrary property, e.g.: `[color:red]` 2. It is an arbitrary variant, e.g.: `[@media(pointer:fine)]:` When running the parsers, both the `VariantMachine` and the `UtilityMachine` will run to completion but end up in a `MachineState::Idle` state. - This is because it is not a valid variant because it didn't end with a `:`. - It's also not a valid arbitrary property, because it didn't include a `:` to separate the property from the value. Looking at the code as a human it's very clear what this is supposed to be, but not from the individual machines perspective. Obviously we want to extract the `gap-y-*` classes here. To solve this problem, we will run over an additional slice of the input, starting at the position before the machines started parsing until the position where the machines stopped parsing. That slice will be this one: `[CssClass("gap-y-6")]` (we already skipped over the whitespace). Now, for every `[` character we see, will start a new `CandidateMachine` right after the `[`'s position and run the machines over that slice. This will now eventually extract the `gap-y-6` class. The next question is, what if there was a `:` (e.g.: `[CssClass("gap-y-6")]:`), then the `VariantMachine` would complete, but the `UtilityMachine` will not because not exists after it. We will apply the same idea in this case. 
Another issue is if we _do_ have actual overlapping ranges. E.g.: `let classes = ['[color:red]'];`. This will extract both the `[color:red]` and `color:red` classes. You have to use your imagination, but the last one has the exact same structure as `hover:flex` (variant + utility). In this case we will make sure to drop spans that are covered by other spans. The extracted `Span`s will be valid candidates therefore if the outer most candidate is valid, we can throw away the inner candidate. ``` Position: 11112222222 67890123456 ↓↓↓↓↓↓↓↓↓↓↓ Span { start: 17, end: 25 } // color:red Span { start: 16, end: 26 } // [color:red] ``` #### Exceptions **JavaScript keys as candidates:** We already talked about the `clsx` scenario, but there are a few more exceptions and that has to do with different syntaxes. **CSS class shorthand in certain templating languages:** In Pug and Slim, you can have a syntax like this: ```pug .flex.underline div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="flex underline"> <div>Hello World</div> </div> ``` </details> We have to make sure that in these scenarios the `.` is a valid boundary character. For this, we introduce a pre-processing step to massage the input a little bit to improve the extraction of the data. We have to make sure we don't make the input smaller or longer otherwise the positions might be off. In this scenario, we could simply replace the `.` with a space. But of course, there are scenarios in these languages where it's not safe to do that. If you want to use `px-2.5` with this syntax, then you'd write: ```pug .flex.px-2.5 div Hello World ``` But that's invalid because that technically means `flex`, `px-2`, and `5` as classes. 
You can use this syntax to get around that: ```pug div(class="px-2.5") div Hello World ``` <details> <summary>Generated HTML</summary> ```html <div class="px-2.5"> <div>Hello World</div> </div> ``` </details> Which means that we can't simply replace `.` with a space, but have to parse the input. Luckily we only care about strings (and we have a `StringMachine` for that) and skip replacing `.` inside of strings. **Ruby's weird string syntax:** ```ruby %w[flex underline] ``` This is valid syntax and is shorthand for: ```ruby ["flex", "underline"] ``` Luckily this problem is solved by running the sub-machines after each `[` character. ### Performance **Testing:** Each machine has a `test_…_performance` test (that is ignored by default) that allows you to test the throughput of that machine. If you want to run them, you can use the following command: ```sh cargo test test_variant_machine_performance --release -- --ignored ``` This will run the test in release mode and allows you to run the ignored test. > [!CAUTION] > This test **_will_** fail, but it will print some output. E.g.: ``` tailwindcss_oxide::extractor::variant_machine::VariantMachine: Throughput: 737.75 MB/s over 0.02s tailwindcss_oxide::extractor::variant_machine::VariantMachine: Duration: 500ns ``` **Readability:** One thing to note when looking at the code is that it's not always written in the cleanest way, because we had to make some sacrifices for performance reasons. The `input` is of type `&[u8]`, so we are already dealing with bytes. Luckily, Rust has some nice ergonomics to easily write `b'['` instead of `0x5b`. A concrete example where we had to sacrifice readability is the state machines where we check the `previous`, `current` and `next` character to make decisions. For a named utility one of the rules is that a `.` must be preceded by and followed by a digit. 
This can be written as: ```rs match (cursor.prev, cursor.curr, cursor.next) { (b'0'..=b'9', b'.', b'0'..=b'9') => { /* … */ } _ => { /* … */ } } ``` But this is not very fast because Rust can't optimize the match statement very well, especially because we are dealing with tuples containing 3 values and each value is a `u8`. To solve this we use some nesting, once we reach `b'.'` only then will we check for the previous and next characters. We will also early return in most places. If the previous character is not a digit, there is no need to check the next character. **Classification and jump tables:** Another optimization we did is to classify the characters into a much smaller `enum` such that Rust _can_ optimize all `match` arms and create some jump tables behind the scenes. E.g.: ```rs #[derive(Debug, Clone, Copy, PartialEq)] enum Class { /// ', ", or ` Quote, /// \ Escape, /// Whitespace characters Whitespace, Other, } const CLASS_TABLE: [Class; 256] = { let mut table = [Class::Other; 256]; macro_rules! set { ($class:expr, $($byte:expr),+ $(,)?) => { $(table[$byte as usize] = $class;)+ }; } set!(Class::Quote, b'"', b'\'', b'`'); set!(Class::Escape, b'\\'); set!(Class::Whitespace, b' ', b'\t', b'\n', b'\r', b'\x0C'); table }; ``` There are only 4 values in this enum, so Rust can optimize this very well. The `CLASS_TABLE` is generated at compile time and must be exactly 256 elements long to fit all `u8` values. **Inlining**: Last but not least, sometimes we use functions to abstract some logic. Luckily Rust will optimize and inline most of the functions automatically. In some scenarios, explicitly adding a `#[inline(always)]` improves performance, sometimes it doesn't improve it at all. You might notice that in some functions the annotation is added and in some it's not. Every state machine was tested on its own and whenever the performance was better with the annotation, it was added. ### Test Plan 1. 
Each machine has a dedicated set of tests to try and extract the relevant part for that machine. Most machines don't even check boundary characters or try to extract nested candidates. So keep that in mind when adding new tests. Extracting inside of nested `[…]` is only handled by the outer most `extractor/mod.rs`. 2. The main `extractor/mod.rs` has dedicated tests for recent bug reports related to missing candidates. 3. You can test each machine's performance if you want to. There is a chance that this new parser is missing candidates even though a lot of tests are added and existing tests have been ported. To double check, we ran the new extractor on our own projects to make sure we didn't miss anything obvious. #### Tailwind UI On Tailwind UI the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index d83b0a506..b3dd94a1d 100644 --- a/./main.css +++ b/./pr.css @@ -5576,9 +5576,6 @@ @layer utilities { --tw-saturate: saturate(0%); filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } - .\!filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,) !important; - } .filter { filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); } ``` </details> The reason `!filter` is gone, is because it was used like this: ```js getProducts.js 23: if (!filter) return true ``` And right now `(` and `)` are not considered valid boundary characters for a candidate. 
#### Catalyst On Catalyst, the diff looks like this: <details> <summary>diff</summary> ```diff diff --git a/./main.css b/./pr.css index 9f8ed129..4aec992e 100644 --- a/./main.css +++ b/./pr.css @@ -2105,9 +2105,6 @@ .outline-transparent { outline-color: transparent; } - .filter { - filter: var(--tw-blur,) var(--tw-brightness,) var(--tw-contrast,) var(--tw-grayscale,) var(--tw-hue-rotate,) var(--tw-invert,) var(--tw-saturate,) var(--tw-sepia,) var(--tw-drop-shadow,); - } .backdrop-blur-\[6px\] { --tw-backdrop-blur: blur(6px); -webkit-backdrop-filter: var(--tw-backdrop-blur,) var(--tw-backdrop-brightness,) var(--tw-backdrop-contrast,) var(--tw-backdrop-grayscale,) var(--tw-backdrop-hue-rotate,) var(--tw-backdrop-invert,) var(--tw-backdrop-opacity,) var(--tw-backdrop-saturate,) var(--tw-backdrop-sepia,); @@ -7141,46 +7138,6 @@ inherits: false; initial-value: solid; } -@property --tw-blur { - syntax: "*"; - inherits: false; -} -@property --tw-brightness { - syntax: "*"; - inherits: false; -} -@property --tw-contrast { - syntax: "*"; - inherits: false; -} -@property --tw-grayscale { - syntax: "*"; - inherits: false; -} -@property --tw-hue-rotate { - syntax: "*"; - inherits: false; -} -@property --tw-invert { - syntax: "*"; - inherits: false; -} -@property --tw-opacity { - syntax: "*"; - inherits: false; -} -@property --tw-saturate { - syntax: "*"; - inherits: false; -} -@property --tw-sepia { - syntax: "*"; - inherits: false; -} -@property --tw-drop-shadow { - syntax: "*"; - inherits: false; -} @property --tw-backdrop-blur { syntax: "*"; inherits: false; ``` </details> The reason for this is that `filter` was only used as a function call: ```tsx src/app/docs/Code.tsx 31: .filter((x) => x !== null) ``` This was tested on all templates and they all remove a very small amount of classes that aren't used. 
The script to test this looks like this: ```sh bun --bun ~/github.com/tailwindlabs/tailwindcss/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o pr.css bun --bun ~/github.com/tailwindlabs/tailwindcss--main/packages/@tailwindcss-cli/src/index.t -- -i ./src/styles/tailwind.css -o main.css git diff --no-index --patch ./{main,pr}.css ``` This is using git worktrees, so the `pr` branch lives in a `tailwindcss` folder, and the `main` branch lives in a `tailwindcss--main` folder. --- ### Fixes: - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/15616 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16750 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16790 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16801 - Fixes: https://github.com/tailwindlabs/tailwindcss/issues/16880 (due to validating the arbitrary property) --- ### Ideas for in the future 1. Right now each machine takes in a `Cursor` object. One potential improvement we can make is to rely on the `input` on its own instead of going via the wrapping `Cursor` object. 2. If you take a look at the AST, you'll notice that utilities and variants have a "root", these are basically prefixes of each available utility and/or variant. We can use this information to filter out candidates and bail out early if we know that a certain candidate will never produce a valid class. 3. Passthrough the `prefix` information. Everything that doesn't start with `tw:` can be skipped. ### Design decisions that didn't make it Once you reach this part, you can stop reading if you want to, but this is more like a brain dump of the things we tried and didn't work out. Wanted to include them as a reference in case we want to look back at this issue and know _why_ certain things are implemented the way they are. 
#### One character at a time In an earlier implementation, the state machines were pure state machines where the `next()` function was called on every single character of the input. This had a lot of overhead because for every character we had to: 1. Ask the `CandidateMachine` which state it was in. 2. Check the `cursor.curr` (and potentially the `cursor.prev` and `cursor.next`) character. 3. If we were in a state where a nested state machine was running, we had to check its current state as well and so on. 4. Once we did all of that we could go to the next character. In this approach, the `MachineState` looked like this instead: ```rs enum MachineState { Idle, Parsing, Done(Span) } ``` This had its own set of problems because now it's very hard to know whether we are done or not. ```html <div class="hover:flex"></div> <!-- ^ --> ``` Let's look at the current position in the example above. At this point, it's both a valid variant and valid utility, so there was a lot of additional state we had to track to know whether we were done or not. #### `Span` stitching Another approach we tried was to just collect all valid variants and utilities and throw them in a big `Vec<Span>`. This reduced the amount of additional state to track and we could track a span the moment we saw a `MachineState::Done(span)`. The next thing we had to do was to make sure that: 1. Covered spans were removed. We still do this part in the current implementation. 2. Combine all touching variant spans (where `span_a.end + 1 == span_b.start`). 3. For every combined variant span, find a corresponding utility span. - If there is no utility span, the candidate is invalid. - If there are multiple candidate spans (this is in theory not possible because we dropped covered spans) - If there is a candidate _but_ it is attached to another set of spans, then the candidate is invalid. E.g.: `flex!block` 4. All left-over utility spans are candidates without variants. 
This approach was slow, and still a bit hard to reason about. #### Matching on tuples While matching against the `prev`, `curr` and `next` characters was very readable and easy to reason about, it was not very fast. Unfortunately, we had to abandon this approach in favor of a more optimized approach. In a perfect world, we would still write it this way, but have some compile time macro that would optimize this for us. #### Matching against `b'…'` instead of classification and jump tables Similar to the previous point, while this is better for readability, it's not fast enough. The jump tables are much faster. Luckily for us, each machine has its own set of rules and context, so it's much easier to reason about a single problem and optimize a single machine. [^candidate]: A candidate is what a potential Tailwind CSS class _could_ be. It's a candidate because at this stage we don't know if it will actually produce something but it looks like it could be a valid class. E.g.: `hover:bg-red-500` is a candidate, but it will only produce something if `--color-red-500` is defined in your theme. --------- Co-authored-by: Jordan Pittman <jordan@cryptica.me> Co-authored-by: Philipp Spiess <hello@philippspiess.com>
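The covered-span rule described in this commit message (dropping `color:red` when `[color:red]` is also extracted) can be sketched as follows. This is a minimal illustration under stated assumptions, not the extractor's actual code: the `Span` fields mirror the `Span { start, end }` shape shown in the message, while `drop_covered_spans` is a hypothetical helper name introduced here.

```rust
/// Inclusive byte range of a candidate, mirroring the `Span { start, end }`
/// shape shown in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical helper: keep only the spans that are not covered by another span.
fn drop_covered_spans(mut spans: Vec<Span>) -> Vec<Span> {
    // Sort by start ascending, then by end descending, so that a covering
    // span always comes before the spans it covers.
    spans.sort_by(|a, b| a.start.cmp(&b.start).then(b.end.cmp(&a.end)));

    let mut kept: Vec<Span> = Vec::with_capacity(spans.len());
    for span in spans {
        match kept.last() {
            // The last kept span starts at or before `span` (sort order) and
            // has the largest end seen so far, so this check is sufficient.
            Some(last) if last.end >= span.end => {}
            _ => kept.push(span),
        }
    }
    kept
}
```

For the `let classes = ['[color:red]'];` example, passing both `Span { start: 17, end: 25 }` and `Span { start: 16, end: 26 }` should leave only the outer `[color:red]` span. Sorting first keeps the pass linear after the sort, since each span only has to be compared with the most recently kept one.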
2025-03-05 11:55:24 +01:00
checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d"
2024-09-26 23:50:31 +02:00
[[package]]
name = "rustix"
version = "0.38.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8acb788b847c24f28525660c4d7758620a7210875711f79e7f663cc152726811"
dependencies = [
"bitflags",
"errno",
"libc",
"linux-raw-sys",
"windows-sys 0.52.0",
]
[[package]]
name = "same-file"
version = "1.0.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "93fc1dc3aaa9bfed95e02e6eadabb4baf7e3078b0bd1b4d7b6b0b68378900502"
dependencies = [
"winapi-util",
]
[[package]]
name = "semver"
version = "1.0.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bebd363326d05ec3e2f532ab7660680f3b02130d780c299bca73469d521bc0ed"
[[package]]
name = "serde"
version = "1.0.163"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2113ab51b87a539ae008b5c6c02dc020ffa39afd2d83cffcb3f4eb2722cebec2"
[[package]]
name = "sharded-slab"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "900fba806f70c630b0a382d0d825e17a0f19fcd059a2ade1ff237bcddf446b31"
dependencies = [
"lazy_static",
]
[[package]]
name = "smallvec"
version = "1.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a507befe795404456341dfab10cef66ead4c041f62b8b11bbb92bffe5d0953e0"
[[package]]
name = "syn"
version = "2.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32d41677bcbe24c20c52e7c70b0d8db04134c5d1066bf98662e2871ad200ea3e"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "tailwind-oxide"
version = "0.0.0"
dependencies = [
"napi",
"napi-build",
"napi-derive",
"rayon",
"tailwindcss-oxide",
]
[[package]]
name = "tailwindcss-oxide"
version = "0.1.0"
dependencies = [
Auto source detection improvements (#14820) This PR introduces a new `source(…)` argument and improves on the existing `@source`. The goal of this PR is to make the automatic source detection configurable. Let's dig in. By default, we will perform automatic source detection starting at the current working directory. Auto source detection will find plain text files (no binaries, images, ...) and will ignore git-ignored files. If you want to start from a different directory, you can use the new `source(…)` next to the `@import "tailwindcss/utilities" layer(utilities) source(…)`. E.g.: ```css /* ./src/styles/index.css */ @import 'tailwindcss/utilities' layer(utilities) source('../../'); ``` Most people won't split their source files, and will just use the simple `@import "tailwindcss";`. For this reason, you can use `source(…)` on the import as well. E.g.: ```css /* ./src/styles/index.css */ @import 'tailwindcss' source('../../'); ``` Sometimes, you want to rely on auto source detection, but also want to look in another directory for source files. In this case, you can use the `@source` directive: ```css /* ./src/index.css */ @import 'tailwindcss'; /* Look for `blade.php` files in `../resources/views` */ @source '../resources/views/**/*.blade.php'; ``` However, you don't need to specify the extension. Instead, you can just point to the directory and all the same automatic source detection rules will apply. ```css /* ./src/index.css */ @import 'tailwindcss'; @source '../resources/views'; ``` If, for whatever reason, you want to disable the default source detection feature entirely, and only want to rely on very specific glob patterns you define, then you can disable it via `source(none)`. 
```css /* Completely disable the default auto source detection */ @import 'tailwindcss' source(none); /* Only look at .blade.php files, nothing else */ @source "../resources/views/**/*.blade.php"; ``` Note: even with `source(none)`, if your `@source` points to a directory, then auto source detection will still be performed in that directory. If you don't want that, then you can simply add explicit files in the globs as seen in the previous example. ```css /* Completely disable the default auto source detection */ @import 'tailwindcss' source(none); /* Run auto source detection in `../resources/views` */ @source "../resources/views"; ``` --------- Co-authored-by: Jordan Pittman <jordan@cryptica.me> Co-authored-by: Adam Wathan <4323180+adamwathan@users.noreply.github.com>
2024-10-29 21:33:34 +01:00
"bexpand",
"bstr",
Improve internal DX around byte classification [1] (#16864) This PR improves the internal DX when working with `u8` classification into a smaller enum. This is done by implementing a `ClassifyBytes` proc derive macro. The benefit of this is that the DX is much better and everything you will see here is done at compile time. Before: ```rs #[derive(Debug, Clone, Copy, PartialEq)] enum Class { ValidStart, ValidInside, OpenBracket, OpenParen, Slash, Other, } const CLASS_TABLE: [Class; 256] = { let mut table = [Class::Other; 256]; macro_rules! set { ($class:expr, $($byte:expr),+ $(,)?) => { $(table[$byte as usize] = $class;)+ }; } macro_rules! set_range { ($class:expr, $start:literal ..= $end:literal) => { let mut i = $start; while i <= $end { table[i as usize] = $class; i += 1; } }; } set_range!(Class::ValidStart, b'a'..=b'z'); set_range!(Class::ValidStart, b'A'..=b'Z'); set_range!(Class::ValidStart, b'0'..=b'9'); set!(Class::OpenBracket, b'['); set!(Class::OpenParen, b'('); set!(Class::Slash, b'/'); set!(Class::ValidInside, b'-', b'_', b'.'); table }; ``` After: ```rs #[derive(Debug, Clone, Copy, PartialEq, ClassifyBytes)] enum Class { #[bytes_range(b'a'..=b'z', b'A'..=b'Z', b'0'..=b'9')] ValidStart, #[bytes(b'-', b'_', b'.')] ValidInside, #[bytes(b'[')] OpenBracket, #[bytes(b'(')] OpenParen, #[bytes(b'/')] Slash, #[fallback] Other, } ``` Before we were generating a `CLASS_TABLE` that we could access directly, but now it will be part of the `Class`. This means that the usage has to change: ```diff - CLASS_TABLE[cursor.curr as usize] + Class::TABLE[cursor.curr as usize] ``` This is slightly worse UX, and this is where another change comes in. We implemented the `From<u8> for #enum_name` trait inside of the `ClassifyBytes` derive macro. This allows us to use `.into()` on any `u8` as long as we are comparing it to a `Class` instance. 
In our scenario: ```diff - Class::TABLE[cursor.curr as usize] + cursor.curr.into() ``` Usage wise, this looks something like this: ```diff while cursor.pos < len { - match Class::TABLE[cursor.curr as usize] { + match cursor.curr.into() { - Class::Escape => match Class::Table[cursor.next as usize] { + Class::Escape => match cursor.next.into() { // An escaped whitespace character is not allowed Class::Whitespace => return MachineState::Idle, // An escaped character, skip ahead to the next character _ => cursor.advance(), }, // End of the string Class::Quote if cursor.curr == end_char => return self.done(start_pos, cursor), // Any kind of whitespace is not allowed Class::Whitespace => return MachineState::Idle, // Everything else is valid _ => {} }; cursor.advance() } MachineState::Idle } } ``` If you manually look at the `Class::TABLE` in your editor for example, you can see that it is properly generated at compile time. Given this input: ```rs #[derive(Clone, Copy, ClassifyBytes)] enum Class { #[bytes_range(b'a'..=b'z')] AlphaLower, #[bytes_range(b'A'..=b'Z')] AlphaUpper, #[bytes(b'@')] At, #[bytes(b':')] Colon, #[bytes(b'-')] Dash, #[bytes(b'.')] Dot, #[bytes(b'\0')] End, #[bytes(b'!')] Exclamation, #[bytes_range(b'0'..=b'9')] Number, #[bytes(b'[')] OpenBracket, #[bytes(b']')] CloseBracket, #[bytes(b'(')] OpenParen, #[bytes(b'%')] Percent, #[bytes(b'"', b'\'', b'`')] Quote, #[bytes(b'/')] Slash, #[bytes(b'_')] Underscore, #[bytes(b' ', b'\t', b'\n', b'\r', b'\x0C')] Whitespace, #[fallback] Other, } ``` This is the result: <img width="1244" alt="image" src="https://github.com/user-attachments/assets/6ffd6ad3-0b2f-4381-a24c-593e4c72080e" />
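As a rough illustration of the `Class::TABLE` plus `From<u8>` pattern described above, the following hand-written sketch shows what such a derive could plausibly expand to. It is an assumption for illustration only, not the actual output of the `ClassifyBytes` macro; it reuses the four-variant `Class` enum from the earlier extractor example.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Class {
    /// ', ", or `
    Quote,
    /// \
    Escape,
    /// Whitespace characters
    Whitespace,
    Other,
}

impl Class {
    /// One entry per possible `u8` value, built at compile time.
    /// (Hand-written here; the real macro generates its table from attributes.)
    const TABLE: [Class; 256] = {
        let mut table = [Class::Other; 256];
        table[b'"' as usize] = Class::Quote;
        table[b'\'' as usize] = Class::Quote;
        table[b'`' as usize] = Class::Quote;
        table[b'\\' as usize] = Class::Escape;
        table[b' ' as usize] = Class::Whitespace;
        table[b'\t' as usize] = Class::Whitespace;
        table[b'\n' as usize] = Class::Whitespace;
        table[b'\r' as usize] = Class::Whitespace;
        table[b'\x0C' as usize] = Class::Whitespace;
        table
    };
}

impl From<u8> for Class {
    /// Classify a byte with a single table lookup.
    #[inline(always)]
    fn from(byte: u8) -> Self {
        Class::TABLE[byte as usize]
    }
}
```

With this in place, a call site can write `match cursor.curr.into() { Class::Quote => …, … }` as in the diff above, provided the compiler can infer `Class` as the target type from the match arms.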
2025-03-05 14:00:07 +01:00
"classification-macros",
"crossbeam",
Add `@source` support (#14078) This PR is an umbrella PR where we will add support for the new `@source` directive. This will allow you to add explicit content glob patterns if you want to look for Tailwind classes in other files that are not automatically detected yet. Right now this is an addition to the existing auto content detection that is automatically enabled in the `@tailwindcss/postcss` and `@tailwindcss/cli` packages. The `@tailwindcss/vite` package doesn't use the auto content detection, but uses the module graph instead. From an API perspective there is not a lot going on. There are only a few things that you have to know when using the `@source` directive, and you probably already know the rules: 1. You can use multiple `@source` directives if you want. 2. The `@source` accepts a glob pattern so that you can match multiple files at once. 3. The pattern is relative to the current file you are in. 4. The pattern includes all files it is matching, even git ignored files. 1. The motivation for this is so that you can explicitly point to a `node_modules` folder if you want to look at `node_modules` for whatever reason. 6. Right now we don't support negative globs (starting with a `!`) yet; that will be available in the near future. Usage example: ```css /* ./src/input.css */ @import "tailwindcss"; @source "../laravel/resources/views/**/*.blade.php"; @source "../../packages/monorepo-package/**/*.js"; ``` It looks like the PR introduced a lot of changes, but this is a side effect of all the other plumbing work we had to do to make this work. For example: 1. We added dedicated integration tests that run on Linux and Windows in CI (just to make sure that all the `path` logic is correct). 2. We have to make sure that the glob patterns are always correct even if you are using `@import` in your CSS and use `@source` in an imported file. This is because we receive the flattened CSS contents where all `@import`s are inlined. 3. 
We have to make sure that we also listen for changes in the files that match any of these patterns and trigger a rebuild. PRs: - [x] https://github.com/tailwindlabs/tailwindcss/pull/14063 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14085 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14079 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14067 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14076 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14080 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14127 - [x] https://github.com/tailwindlabs/tailwindcss/pull/14135 Once all the PRs are merged, then this umbrella PR can be merged. > [!IMPORTANT] > Make sure to merge this without rebasing such that each individual PR ends up on the main branch. --------- Co-authored-by: Philipp Spiess <hello@philippspiess.com> Co-authored-by: Jordan Pittman <jordan@cryptica.me> Co-authored-by: Adam Wathan <adam.wathan@gmail.com>
2024-08-07 16:38:44 +02:00
"dunce",
"fast-glob",
"globwalk",
Add `@source not` support (#17255) This PR adds a new source detection feature: `@source not "…"`. It can be used to exclude files specifically from your source configuration without having to think about creating a rule that matches all but the requested file: ```css @import "tailwindcss"; @source not "../src/my-tailwind-js-plugin.js"; ``` While working on this feature, we noticed that there are multiple places with different heuristics we used to scan the file system. These are: - Auto source detection (so the default configuration or an `@source "./my-dir"`) - Custom sources (e.g. `@source "./**/*.bin"`; these contain file extensions) - The code to detect updates on the file system Because of the different heuristics, we were able to construct failing cases (e.g. when you create a new file in `my-dir` that would be thrown out by auto-source detection, it would actually be scanned). We were also leaving a lot of performance on the table as the file system is traversed multiple times for certain problems. To resolve these issues, we're now unifying all of these systems into one `ignore` crate walker setup. We also implemented features like auto-source-detection and the `not` flag as additional _gitignore_ rules only, avoiding the need for a lot of custom code to make decisions. At a high level, this is what happens now: - We collect all non-negative `@source` rules into a list of _roots_ (that is the source directory for this rule) and optional _globs_ (that is the actual rules for files in this file). For custom sources (i.e. with a custom `glob`), we add an allowlist rule to the gitignore setup, so that we can be sure these files are always included. - For every negative `@source` rule, we create respective ignore rules. - Furthermore we have a custom filter that ensures files are only read if they have been changed since the last time they were read. 
So, consider the following setup: ```css /* packages/web/src/index.css */ @import "tailwindcss"; @source "../../lib/ui/**/*.bin"; @source not "../../lib/ui/expensive.bin"; ``` This creates a git ignore file that (simplified) looks like this: ```gitignore # Auto-source rules *.{exe,node,bin,…} *.{css,scss,sass,…} {node_modules,git}/ # Custom sources can overwrite auto-source rules !lib/ui/**/*.bin # Negative rules lib/ui/expensive.bin ``` We then use this information _on top of your existing `.gitignore` setup_ to resolve files (i.e. if your `.gitignore` contains rules such as `dist/`, that line is going to be added _before_ any of the rules outlined in the example above). This allows negative rules to allow-list your `.gitignore` rules. To implement this, we rely on the `ignore` crate, but we had to make various, very specific changes to it, so we decided to fork the crate. All changes are prefixed with a `// CHANGED:` block but here are the most important ones: - We added a way to add custom ignore rules that _extend_ (rather than overwrite) your existing `.gitignore` rules - We updated the order in which files are resolved and made it so that more-specific files can allow-list more generic ignore rules. - We resolved various issues related to adding more than one base path to the traversal and ensured it works consistently on Linux, macOS, and Windows. ## Behavioral changes 1. Any custom glob defined via `@source` now wins over your `.gitignore` file and the auto-content rules. - Resolves #16920 3. The `node_modules` and `.git` folders as well as the `.gitignore` file are now ignored by default (but can be overridden by an explicit `@source` rule). - Resolves #17318 - Resolves #15882 4. Source paths into ignored-by-default folders (like `node_modules`) now also win over your `.gitignore` configuration and auto-content rules. - Resolves #16669 5. Introduced `@source not "…"` to negate any previous rules. - Resolves #17058 6. 
Negative `content` rules in your legacy JavaScript configuration (e.g. `content: ['!./src']`) now work with v4. - Resolves #15943 7. The order of `@source` definitions matters now, because you can technically include or negate previous rules. This is similar to your `.gitignore` file. 9. Rebuilds in watch mode now take the `@source` configuration into account - Resolves #15684 ## Combining with other features Note that the `not` flag is also already compatible with [`@source inline(…)`](https://github.com/tailwindlabs/tailwindcss/pull/17147) added in an earlier commit: ```css @import "tailwindcss"; @source not inline("container"); ``` ## Test plan - We added a bunch of oxide unit tests to ensure that the right files are scanned - We updated the existing integration tests with new `@source not "…"` specific examples and updated the existing tests to match the subtle behavior changes - We also added a new special tag `[ci-all]` that, when added to the description of a PR, causes the PR to run unit and integration tests on all operating systems. [ci-all] --------- Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-25 15:54:41 +01:00
"ignore 0.4.23",
"log",
"pretty_assertions",
"rayon",
"regex",
"rustc-hash",
"tempfile",
"tracing",
"tracing-subscriber",
"unicode-width",
"walkdir",
]
[[package]]
name = "tempfile"
version = "3.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0f2c9fc62d0beef6951ccffd757e241266a2c833136efbe35af6cd2567dca5b"
dependencies = [
"cfg-if",
"fastrand",
"once_cell",
"rustix",
"windows-sys 0.59.0",
]
[[package]]
name = "thread_local"
version = "1.1.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3fdd6f064ccff2d6567adcb3873ca630700f00b5ad3f060c25b5dcfd9a4ce152"
dependencies = [
"cfg-if",
"once_cell",
]
[[package]]
name = "tracing"
version = "0.1.40"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c3523ab5a71916ccf420eebdf5521fcef02141234bbc0b8a49f2fdc4544364ef"
dependencies = [
"pin-project-lite",
"tracing-attributes",
"tracing-core",
]
[[package]]
name = "tracing-attributes"
version = "0.1.27"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "34704c8d6ebcbc939824180af020566b01a7c01f80641264eba0999f6c2b6be7"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "tracing-core"
version = "0.1.32"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c06d3da6113f116aaee68e4d601191614c9053067f9ab7f6edbcb161237daa54"
dependencies = [
"once_cell",
"valuable",
]
[[package]]
name = "tracing-log"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ee855f1f400bd0e5c02d150ae5de3840039a3f54b025156404e34c23c03f47c3"
dependencies = [
"log",
"once_cell",
"tracing-core",
]
[[package]]
name = "tracing-subscriber"
version = "0.3.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ad0f048c97dbd9faa9b7df56362b8ebcaa52adb06b498c050d2f4e32f90a7a8b"
dependencies = [
"matchers",
"nu-ansi-term",
"once_cell",
"regex",
"sharded-slab",
"smallvec",
"thread_local",
"tracing",
"tracing-core",
"tracing-log",
]
[[package]]
name = "unicode-ident"
version = "1.0.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b15811caf2415fb889178633e7724bad2509101cde276048e013b9def5e51fa0"
[[package]]
name = "unicode-segmentation"
version = "1.10.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1dd624098567895118886609431a7c3b8f516e41d30e0643f03d94592a147e36"
[[package]]
name = "unicode-width"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fc81956842c57dac11422a97c3b8195a1ff727f06e85c84ed2e8aa277c9a0fd"
[[package]]
name = "valuable"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "830b7e5d4d90034032940e4ace0d9a9a057e7a45cd94e6c007832e39edb82f6d"
[[package]]
name = "walkdir"
version = "2.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "29790946404f91d9c5d06f9874efddea1dc06c5efe94541a7d6863108e3a5e4b"
dependencies = [
"same-file",
"winapi-util",
]
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-util"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "70ec6ce85bb158151cae5e5c87f95a8e97d2c0c4b001223f33a334e3ce5de178"
dependencies = [
"winapi",
]
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
[[package]]
name = "windows-sys"
version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
dependencies = [
"windows-targets",
]
[[package]]
name = "windows-sys"
version = "0.59.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1e38bc4d79ed67fd075bcc251a1c39b32a1776bbe92e5bef1f0bf1f8c531853b"
dependencies = [
"windows-targets",
]
[[package]]
name = "windows-targets"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9b724f72796e036ab90c1021d4780d4d3d648aca59e491e6b98e725b84e99973"
dependencies = [
"windows_aarch64_gnullvm",
"windows_aarch64_msvc",
"windows_i686_gnu",
"windows_i686_gnullvm",
"windows_i686_msvc",
"windows_x86_64_gnu",
"windows_x86_64_gnullvm",
"windows_x86_64_msvc",
]
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32a4622180e7a0ec044bb555404c800bc9fd9ec262ec147edd5989ccd0c02cd3"
[[package]]
name = "windows_aarch64_msvc"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09ec2a7bb152e2252b53fa7803150007879548bc709c039df7627cabbd05d469"
[[package]]
name = "windows_i686_gnu"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8e9b5ad5ab802e97eb8e295ac6720e509ee4c243f69d781394014ebfe8bbfa0b"
[[package]]
name = "windows_i686_gnullvm"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0eee52d38c090b3caa76c563b86c3a4bd71ef1a819287c19d586d7334ae8ed66"
[[package]]
name = "windows_i686_msvc"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "240948bc05c5e7c6dabba28bf89d89ffce3e303022809e73deaefe4f6ec56c66"
[[package]]
name = "windows_x86_64_gnu"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "147a5c80aabfbf0c7d901cb5895d1de30ef2907eb21fbbab29ca94c5b08b1a78"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "24d5b23dc417412679681396f2b49f3de8c1473deb516bd34410872eff51ed0d"
[[package]]
name = "windows_x86_64_msvc"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "589f6da84c646204747d1270a2a5661ea66ed1cced2631d546fdfb155959f9ec"
[[package]]
name = "yansi"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cfe53a6657fd280eaa890a3bc59152892ffa3e30101319d168b781ed6529b049"