[SPARK-56415][INFRA] Refactor create_spark_jira.py for LLM use and extract shared utilities #55281
cloud-fan wants to merge 12 commits into apache:master
…A ticket creation Co-authored-by: Isaac
- Remove CLI version flag; auto-detect latest unreleased version instead - Split preflight into check_jira_access() and detect_affected_version() - Hint in AGENTS.md to review versions after ticket creation Co-authored-by: Isaac
…ssues only Co-authored-by: Isaac
I can add back the old script if people still need it, but under a different name, as create_spark_jira.py should only create a ticket.
Thank you for pinging me. Let me try today.
parser = argparse.ArgumentParser(description="Create a Spark JIRA issue.")
parser.add_argument("title", nargs="?", help="Title of the JIRA issue")
parser.add_argument("-p", "--parent", help="Parent JIRA ID for subtasks")
I'm always using this parent JIRA ID feature. Please restore it, @cloud-fan.
…xclusive Co-authored-by: Isaac
dongjoon-hyun left a comment
I've been considering this improvement. It would be great to preserve all existing features, because the script is a fallback for environments or users without an LLM installed, @cloud-fan.
I can add back the old script if people still need it, but under a different name, as create_spark_jira.py should only create a ticket.
In addition, it would be great if we can keep the existing file name because create_spark_jira.py is used conventionally by multiple committers across multiple Spark sub-projects already.
- https://github.com/apache/spark/blob/master/dev/create_spark_jira.py
- https://github.com/apache/spark-kubernetes-operator/blob/main/dev/create_spark_jira.py
- https://github.com/apache/spark-connect-swift/blob/main/dev/create_spark_jira.py
It's the same for merge_spark_pr.py:
- https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py
- https://github.com/apache/spark-kubernetes-operator/blob/main/dev/merge_spark_pr.py
- https://github.com/apache/spark-connect-swift/blob/main/dev/merge_spark_pr.py
This PR would be better narrowed down to teaching LLMs to skip the existing human-oriented features that are not required in an LLM environment.
I'm fine with creating a new script dedicated to this LLM use case. But I'm confused about the naming suggestion. This simply creates a Spark JIRA ticket, but we can't name it
…anch.py Co-authored-by: Isaac
…ils.py Co-authored-by: Isaac
@dongjoon-hyun the old script is kept under a more accurate name, as it does more than JIRA creation. dev/spark_jira_utils.py is created to share code between the two scripts.
PR description also updated.
What changes were proposed in this pull request?
Refactors dev/create_spark_jira.py into a lightweight LLM-friendly script, extracts shared JIRA utilities into dev/spark_jira_utils.py, and preserves the original interactive script as dev/create_jira_and_branch.py.
Changes:
- dev/spark_jira_utils.py (new): shared module containing get_jira_client(), detect_affected_version(), list_components(), and create_jira_issue().
- dev/create_spark_jira.py (simplified): stripped down to only create a JIRA ticket and print the key. Made -c (component) required, added a --list-components flag, added --parent/--type mutual exclusivity validation, and improved error messages when the jira library or JIRA_ACCESS_TOKEN is missing.
- dev/create_jira_and_branch.py (new file, old behavior): the original script that creates a JIRA ticket, checks out a branch, and creates an initial commit, now importing shared utilities from spark_jira_utils.py.
- CLAUDE.md: updated instructions for LLM agents to use the simplified script.
Why are the changes needed?
Previously, CLAUDE.md told LLM agents to ask the user to create JIRA tickets manually. The create_spark_jira.py script existed but included interactive prompts and git side effects (branch creation, committing) that made it unsuitable for automated use. This change makes the script LLM-friendly while preserving the original interactive workflow in a separate script, with shared JIRA logic extracted to avoid duplication.
Does this PR introduce any user-facing change?
No. The original interactive workflow is preserved in dev/create_jira_and_branch.py.
How was this patch tested?
#55280 was created with this new prompt.
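
The argument rules described for the simplified script (required component, a --list-components flag, and --parent/--type being mutually exclusive) can also be exercised offline, without JIRA access. The following is only a minimal sketch of how such validation might look with argparse; it is not the code from this PR. The function names build_parser and parse_args, the long option --component, and the exact error wording are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the rules listed in the PR description."""
    parser = argparse.ArgumentParser(description="Create a Spark JIRA issue.")
    parser.add_argument("title", nargs="?", help="Title of the JIRA issue")
    # The long form --component is an assumption; the PR only mentions -c.
    parser.add_argument("-c", "--component", help="Component for the issue")
    parser.add_argument("-p", "--parent", help="Parent JIRA ID for subtasks")
    parser.add_argument("--type", dest="issue_type", help="Issue type")
    parser.add_argument(
        "--list-components",
        action="store_true",
        help="List available components and exit",
    )
    return parser


def parse_args(argv):
    """Parse argv and apply the cross-argument checks by hand."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.list_components:
        # Listing components needs neither a title nor a component.
        return args
    if not args.title:
        parser.error("a title is required")
    if not args.component:
        parser.error("-c is required (use --list-components to see choices)")
    if args.parent and args.issue_type:
        # A subtask gets its type from the parent relationship.
        parser.error("--parent and --type are mutually exclusive")
    return args


if __name__ == "__main__":
    print(parse_args(["Fix flaky test", "-c", "SQL", "--parent", "SPARK-1"]))
```

Because parser.error() exits with status 2 after printing a usage message, a calling agent can distinguish invalid arguments from other failures. argparse's add_mutually_exclusive_group() would be an alternative to the manual --parent/--type check.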
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code
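
One of the commits in this PR replaces the CLI version flag with auto-detection of the latest unreleased version. The selection logic could be sketched roughly as below. This is an illustration under stated assumptions, not the actual detect_affected_version() from dev/spark_jira_utils.py: the function name detect_latest_unreleased is hypothetical, versions are modeled as plain dicts with name, released, and archived keys, and only names of the form X.Y.Z are considered.

```python
import re

# Only plain X.Y.Z names are treated as candidate versions (an assumption).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def detect_latest_unreleased(versions):
    """Return the name of the highest unreleased X.Y.Z version, or None."""
    candidates = []
    for v in versions:
        if v.get("released") or v.get("archived"):
            continue
        m = _VERSION_RE.match(v["name"])
        if m:
            # Compare numerically so that 4.10.0 sorts above 4.2.0.
            key = tuple(int(part) for part in m.groups())
            candidates.append((key, v["name"]))
    return max(candidates)[1] if candidates else None
```

Comparing integer tuples rather than raw strings avoids ordering 4.10.0 below 4.2.0. As the PR notes, the detected version is only a default and is worth reviewing after the ticket is created.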