Community members are increasingly filing blank issues, which bypass the labeling and triaging flows we've implemented.
Signed-off-by: Joe Cotant <joe@anyscale.com>
- Remove @jjyao to stop spamming him
- Add @israbbani and @MengjinYan for `_common/` and public protos
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
## Summary
This PR adds direct ingress functionality to Ray Serve, allowing
replicas to listen directly on ports for incoming traffic. This enables
deployments where each replica is individually addressable, which is
useful for Kubernetes ingress controllers and load balancers.
### Key changes:
- Add direct ingress HTTP utility for ASGI receive proxying
- Add direct ingress gRPC support (unary-unary handlers, health checks)
- Add replica response generator for disconnect detection
- Add node port manager for per-node port allocation
- Add direct ingress support to controller (target groups, port
allocation)
- Add direct ingress support to replica (HTTP/gRPC servers, graceful
shutdown)
- Add direct ingress constants and feature flag
(`RAY_SERVE_ENABLE_DIRECT_INGRESS`)
- Add integration and unit tests
The feature is disabled by default and can be enabled by setting
`RAY_SERVE_ENABLE_DIRECT_INGRESS=1`.
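As a rough illustration of the opt-in behavior described above, the following sketch reads the flag from the environment. The flag name is taken from this description; the helper `direct_ingress_enabled` and the rule that only the value `1` enables the feature are assumptions made for the example, not the parsing logic in Ray Serve itself.

```python
import os

# Flag name from the PR description. The parsing rule (only "1" enables
# the feature) is an assumption made for this sketch.
FLAG_NAME = "RAY_SERVE_ENABLE_DIRECT_INGRESS"


def direct_ingress_enabled(environ=None):
    """Return True when the feature flag is set to "1" (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Disabled by default: a missing variable is treated like "0".
    return env.get(FLAG_NAME, "0") == "1"


if __name__ == "__main__":
    print(direct_ingress_enabled({}))
    print(direct_ingress_enabled({FLAG_NAME: "1"}))
```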
## Test plan
- [x] Run `test_direct_ingress.py` integration tests
- [x] Run `test_direct_ingress_standalone.py` standalone tests
- [x] Run `test_controller_direct_ingress.py` unit tests
- [x] Benchmarks
## Benchmarks
Validated using the release image from
https://buildkite.com/ray-project/release/builds/76725#019be801-2dc9-45f7-bf1e-3d7ae41ff6d8
with `RAY_SERVE_ENABLE_DIRECT_INGRESS=1`. Direct ingress provides a 6x
reduction in TTFT and a 1.8x improvement in throughput. Max concurrency
32/replica, 256 prompts/replica total. ITL 1024 OTL 256.
<img width="4171" height="2955" alt="httpproxy_vs_direct_ingress"
src="https://github.com/user-attachments/assets/2fddd7df-ad8a-4422-920a-d692993f846c"
/>
HTTPProxy (baseline) sweep:
https://gist.github.com/eicherseiji/630a83a0582aad22fb4655e6464a0038
Direct Ingress (this PR) sweep:
https://gist.github.com/eicherseiji/18d326524cbfe48e0353bfa8243bbb28
Toy load balancer used to test Direct Ingress:
https://gist.github.com/eicherseiji/cce80555ac2e8d04482483e5f8422509
---------
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Eugene Kim <eugenek@anyscale.com>
Co-authored-by: Han-Ju Chen (Future-Outlier) <eric901201@gmail.com>
## Summary
- Add CODEOWNERS entry for
`/doc/source/data/doc_code/working-with-llms/` to assign ownership to
the ray-llm team
## Test plan
- N/A (CODEOWNERS change only)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
`ray-docs` has become a bottleneck for review. This change no longer
requires their approval for library documentation changes, but keeps
`ray-docs` as a catch-all for other docs changes.
Flyby: removing code ownership for removed Ray Workflows library
directories.
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
## Description
Shortening the template per @edoakes' feedback.
## Related issues
Follow-up to https://github.com/ray-project/ray/pull/57193.
## Additional information
Made the following changes:
1. Removed `Types of change` and `Checklist`.
2. Updated contribution guide to point to Ray Docs.
3. Renamed `Additional context` to `Additional information` to be more
encompassing.
---------
Signed-off-by: Matthew Deng <matthew.j.deng@gmail.com>
<!-- Thank you for contributing to Ray! 🚀 -->
<!-- Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->
<!-- 💡 Tip: Mark as draft if you want early feedback, or ready for
review when it's complete -->
## Description
<!-- Briefly describe what this PR accomplishes and why it's needed -->
Improved the Ray pull request template to make it less overwhelming for
contributors while giving maintainers better information for reviews and
release notes. The new template has clearer sections and organized
checklists that are much easier to fill out. This should encourage more
contributions while making the review process smoother and release note
generation more straightforward.
## Related issues
<!-- Link related issues: "Fixes #1234", "Closes #1234", or "Related to
#1234" -->
## Types of change
- [ ] Bug fix 🐛
- [ ] New feature ✨
- [x] Enhancement 🚀
- [ ] Code refactoring 🔧
- [ ] Documentation update 📖
- [ ] Chore 🧹
- [ ] Style 🎨
## Checklist
**Does this PR introduce breaking changes?**
- [ ] Yes ⚠️
- [x] No
<!-- If yes, describe what breaks and how users should migrate -->
**Testing:**
- [ ] Added/updated tests for my changes
- [x] Tested the changes manually
- [ ] This PR is not tested ❌ _(please explain why)_
**Code Quality:**
- [x] Signed off every commit (`git commit -s`)
- [x] Ran pre-commit hooks ([setup
guide](https://docs.ray.io/en/latest/ray-contribute/getting-involved.html#lint-and-formatting))
**Documentation:**
- [ ] Updated documentation (if applicable) ([contribution
guide](https://docs.ray.io/en/latest/ray-contribute/docs.html))
- [ ] Added new APIs to `doc/source/` (if applicable)
## Additional context
<!-- Optional: Add screenshots, examples, performance impact, breaking
change details -->
---------
Signed-off-by: Matthew Deng <matthew.j.deng@gmail.com>
Signed-off-by: matthewdeng <matthew.j.deng@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This pull request adds a configurable `max_constructor_retry_count` for
deployments, enabling users to define how many times a failing
constructor should be retried.
The value can now be set via both an environment variable and the
deployment config. When both are provided, the environment variable
takes precedence.
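The precedence rule above can be sketched as a small resolver. This is a minimal illustration only: the environment variable name `EXAMPLE_MAX_CONSTRUCTOR_RETRY_COUNT`, the helper name, and the fallback default of `3` are all hypothetical placeholders, not the names or values used in Ray Serve.

```python
import os

# Hypothetical variable name, used only for this illustration.
ENV_VAR = "EXAMPLE_MAX_CONSTRUCTOR_RETRY_COUNT"


def resolve_max_constructor_retry_count(config_value=None, environ=None, default=3):
    """Pick the retry count: environment variable first, then config, then default.

    The default of 3 is a placeholder, not a value taken from the PR.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is not None:
        # The environment variable wins even when the config sets a value.
        return int(raw)
    if config_value is not None:
        return config_value
    return default


if __name__ == "__main__":
    print(resolve_max_constructor_retry_count(config_value=5, environ={ENV_VAR: "2"}))
    print(resolve_max_constructor_retry_count(config_value=5, environ={}))
```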
GH issue link: https://github.com/ray-project/ray/issues/55786
---------
Signed-off-by: harshit <harshit@anyscale.com>
Co-authored-by: Cindy Zhang <cindyzyx9@gmail.com>
Update the PR template to tell users to run pre-commit instead of
`scripts/format.sh`.
---------
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
<!-- Thank you for your contribution! Please review
https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before
opening a pull request. -->
<!-- Please add a reviewer to the assignee section when you create a PR.
If you don't have the access to it, we will shortly find a reviewer and
assign them to your PR. -->
## Why are these changes needed?
<!-- Please give a short summary of the change and the problem this
solves. -->
The LLM APIs under `ray.data.llm` are maintained by
@ray-project/ray-llm, not the Data team. This PR updates the
`CODEOWNERS` to reflect that responsibility, assigning ownership to the
LLM team even though the code lives in the Data directory.
## Related issue number
<!-- For example: "Closes #1234" -->
## Checks
- [ ] I've signed off every commit (by using the -s flag, i.e., `git
commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a
method in Tune, I've added it in `doc/source/tune/api/` under the
corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
- [ ] Unit tests
- [ ] Release tests
- [ ] This PR is not tested :(
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Add ownership and a README describing how to modify the proto files in
the public directory. This relates to recent work to define proto
exposure via directory structure and to set expectations for maintainers
and users of these protos.
Test:
- CI
Signed-off-by: Cuong Nguyen <can@anyscale.com>
Add Ray Train maintainers as the CODEOWNERS for the `/python/ray/air`
subdirectory.
Without this change, this directory will have Ray Core assigned as the
CODEOWNER due to [this
line](ca2d866a95/.github/CODEOWNERS (L20-L21)).
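The fallback behavior follows from how GitHub resolves `CODEOWNERS`: the last matching pattern in the file takes precedence. The sketch below models that rule with plain directory prefixes. Real `CODEOWNERS` files use gitignore-style patterns, so this is a simplification, and the rule contents, team handles, and file path are hypothetical placeholders.

```python
def owners_for(path, rules):
    """Return the owners of the LAST rule whose prefix matches `path`.

    `rules` is an ordered list of (directory_prefix, owners) pairs. Only
    prefix matching is modeled here, not full gitignore-style patterns.
    """
    matched = None
    for prefix, owners in rules:
        if path.startswith(prefix):
            matched = owners  # later rules override earlier ones
    return matched


# Hypothetical rules: a broad rule, then a more specific one added after it.
rules_before = [("/python/ray/", ["@example/core"])]
rules_after = rules_before + [("/python/ray/air/", ["@example/train"])]

if __name__ == "__main__":
    path = "/python/ray/air/config.py"  # hypothetical file
    print(owners_for(path, rules_before))
    print(owners_for(path, rules_after))
```

With only the broad rule, the file falls to the broad rule's owners. Once the more specific rule is appended, it wins because it appears later in the list.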
Signed-off-by: Matthew Deng <matt@anyscale.com>
## Why are these changes needed?
<img width="1203" height="356" alt="max_bytes_to_read"
src="https://github.com/user-attachments/assets/55f28ee9-dd96-48fd-a990-93adea3241cc"
/>
---------
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Balaji Veeramani <bveeramani@berkeley.edu>
The `bounced` label from the stalebot might be misleading to
contributors, so we are renaming it to `unstale` for clarity.
Signed-off-by: Christina Zhu <christina@anyscale.com>
We've had some stalebot conflicts where PRs are being marked stale
repeatedly. With this change, when a stale PR becomes unstale, the bot
adds a `bounced` label. A PR with this label will never be marked stale
again.
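The intended label behavior can be modeled as a small state transition. This is a simplified sketch, not the stale action's actual logic: the function name, the `had_activity` input, and the 14-day threshold default are assumptions chosen for illustration.

```python
def next_labels(labels, had_activity, inactive_days, stale_after_days=14):
    """Return a PR's labels after one bot pass (simplified, hypothetical model)."""
    labels = set(labels)
    if "stale" in labels and had_activity:
        # Activity on a stale PR: clear `stale` and record the bounce.
        labels.discard("stale")
        labels.add("bounced")
    elif "bounced" not in labels and inactive_days >= stale_after_days:
        # Only PRs that never bounced are eligible to be marked stale.
        labels.add("stale")
    return labels


if __name__ == "__main__":
    print(next_labels({"stale"}, had_activity=True, inactive_days=0))
    print(next_labels({"bounced"}, had_activity=False, inactive_days=30))
```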
Signed-off-by: Christina Zhu <christina@anyscale.com>
Currently, our stalebot runs once a day at midnight UTC. We occasionally
hit our 500-operation limit, and it can take up to 24 hours for a stale
label to be removed after activity. As a result, not all of our PRs are
processed properly.
This change aims to fix that by:
- Running every 12 hours to improve responsiveness.
- Running at XX:15 instead of on the hour to avoid the XX:00 bottleneck
(the bot sometimes takes a while to start; not a huge issue, but this
makes it nicer).
- Allowing 1000 operations a day so the bot can fully process all the
open PRs.
After this change, it should be fairly robust for the foreseeable
future.
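One cron expression that matches this description is `15 */12 * * *` (minute 15 of every 12th hour). Whether the workflow uses exactly this expression is an assumption. The sketch below lists the UTC run times such a schedule would produce for one day; the helper name and the example date are arbitrary.

```python
from datetime import date, datetime, timedelta, timezone


def runs_for_day(day, minute=15, interval_hours=12):
    """List the UTC run times for `day`, starting at 00:<minute>."""
    first = datetime(day.year, day.month, day.day, 0, minute, tzinfo=timezone.utc)
    return [first + timedelta(hours=h) for h in range(0, 24, interval_hours)]


if __name__ == "__main__":
    # Arbitrary example date.
    for run in runs_for_day(date(2025, 1, 1)):
        print(run.isoformat())
```

For any given day this yields two runs, at 00:15 and 12:15 UTC.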
Signed-off-by: Christina Zhu <christina@anyscale.com>
The stale PR job ran successfully yesterday but only managed to process 9 PRs before exhausting its operations budget. The `operations-per-run` setting is currently 30 (same as the previous stale bot). However, we still have hundreds of open PRs, and at this rate it will take a while to finish processing them all.
Since our GitHub API rate limit is about 5000 requests per hour, I've raised `operations-per-run` to 500 so we can process more PRs. After a run, we should still have about 4500 left.
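The arithmetic behind this change can be written out as a back-of-the-envelope estimate. The figures come from the description above; the assumption that the number of PRs processed scales linearly with the operations budget is mine and may not hold in practice, so treat the per-run estimate as a rough guide only.

```python
API_BUDGET_PER_HOUR = 5000  # approximate figure from the description
OLD_OPS_PER_RUN = 30
NEW_OPS_PER_RUN = 500
PRS_PROCESSED_WITH_OLD_BUDGET = 9


def remaining_budget(budget=API_BUDGET_PER_HOUR, ops=NEW_OPS_PER_RUN):
    """API budget left in the hour after one run uses its full allowance."""
    return budget - ops


def estimated_prs_per_run(ops=NEW_OPS_PER_RUN):
    """Rough estimate, ASSUMING PRs processed scale linearly with operations."""
    return ops * PRS_PROCESSED_WITH_OLD_BUDGET // OLD_OPS_PER_RUN


if __name__ == "__main__":
    print(remaining_budget())
    print(estimated_prs_per_run())
```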
Signed-off-by: Christina Zhu <christina@anyscale.com>
Currently, the stale action still pulls issues even though we only want
it to run on PRs. Therefore, I made the issue stale timeout settings
stricter and removed the `issues` permission entirely. This way, we
won't waste GitHub Actions operations on issues.
Signed-off-by: Christina Zhu <christina@anyscale.com>
Currently, our stalebot uses the Probot stalebot, which has been deprecated in favor of the GitHub Actions "Close Stale Issues" action. I think it is a good idea to deprecate the unsupported stalebot and move our config over to the supported GH Stalebot.
We also had internal discussions and made some changes to our stalebot settings which I will elaborate on below:
Changes:
- Run frequency: runs once a day at midnight UTC. This should be enough given our PR frequency.
- Issues: issues will no longer be considered stale.
- PRs: pull requests will be considered stale after 14 days of no activity. After 14 more days (28 days total), they will be closed due to inactivity.
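The PR timeline above can be summarized as a small classifier. This is a sketch of the policy as described, not the stale action's implementation; the function name, the state names, and the treatment of the exact boundary days (day 14 counts as stale, day 28 as closed) are assumptions.

```python
def pr_state(days_inactive, stale_after=14, close_after=28):
    """Classify a PR by days of inactivity (hypothetical helper)."""
    if days_inactive >= close_after:
        return "closed"
    if days_inactive >= stale_after:
        return "stale"
    return "active"


if __name__ == "__main__":
    for days in (0, 13, 14, 27, 28):
        print(days, pr_state(days))
```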
---------
Signed-off-by: Christina Zhu <christina@christina-anyscale-laptop.local>
Signed-off-by: Christina Zhu <christina@anyscale.com>
Co-authored-by: Christina Zhu <christina@christina-anyscale-laptop.local>
Co-authored-by: Lonnie Liu <95255098+aslonnie@users.noreply.github.com>