A Deep Reinforcement Learning (DRL)-Based Transformer Method for Solving the Open Shop Scheduling Problem

Researchers have developed a Transformer-based scheduling policy, trained with Deep Reinforcement Learning (DRL), for the Open Shop Scheduling Problem (OSSP). Trained only on small Taillard benchmark instances (up to 10x10), the policy is applied without retraining to randomly generated instances of up to 100x100, where it stays competitive with the EST dispatching rule and substantially outperforms SPT and LPT.

Abstract: The open shop scheduling problem (OSSP) arises in many industrial and service settings but remains computationally challenging as the number of jobs and machines increases. While exact methods quickly become intractable, classical dispatching rules and metaheuristics may require substantial tuning to maintain solution quality at large scales. This study develops a Transformer-based scheduling policy for OSSP using an encoder-decoder architecture with multi-head attention. The model is trained on Taillard benchmark instances (4x4, 5x5, 7x7, and 10x10) using only the processing-time matrix as input and produces feasible schedules with makespans typically within 15-30% of best-known values. To evaluate scalability, the trained policy is applied without retraining to randomly generated instances from 40x40 to 100x100 and compared against classical dispatching heuristics, including SPT, LPT, MWKR, and EST. Across these large instances, the Transformer achieved average gaps of 12.89-15.12% relative to a standard lower bound. Compared with EST, the Transformer remained competitive, typically within a modest margin, while substantially outperforming SPT and LPT. These results indicate that a Transformer policy trained on small OSSP instances can generalize to substantially larger problems and provide a feature-light, learning-based alternative to classical dispatching rules.
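
To make the baselines and the gap metric in the abstract concrete, here is a minimal Python sketch. It is not the authors' code. It assumes that the "standard lower bound" is the larger of the heaviest job load and the heaviest machine load, and it treats SPT, LPT, MWKR, and EST as simple greedy list-scheduling rules. The function names (lower_bound, dispatch, gap_percent), the tie-breaking order, and the 2x2 example matrix are illustrative assumptions, and the paper's exact rule definitions may differ.

```python
from typing import List

Matrix = List[List[int]]  # p[j][i]: processing time of job j on machine i


def lower_bound(p: Matrix) -> int:
    """Max of the heaviest job load and the heaviest machine load (assumed bound)."""
    job_load = max(sum(row) for row in p)
    machine_load = max(sum(col) for col in zip(*p))
    return max(job_load, machine_load)


def dispatch(p: Matrix, rule: str) -> int:
    """Greedily schedule every (job, machine) operation and return the makespan.

    In an open shop, a job's operations may run in any order, but a job uses one
    machine at a time and a machine serves one job at a time.
    """
    n, m = len(p), len(p[0])
    job_free = [0] * n    # time at which each job is next available
    mach_free = [0] * m   # time at which each machine is next available
    work_left = [sum(row) for row in p]  # remaining processing time per job
    pending = {(j, i) for j in range(n) for i in range(m)}

    def priority(op):
        j, i = op
        start = max(job_free[j], mach_free[i])
        if rule == "SPT":   # shortest processing time first
            return (p[j][i], start, j, i)
        if rule == "LPT":   # longest processing time first
            return (-p[j][i], start, j, i)
        if rule == "MWKR":  # job with the most work remaining first
            return (-work_left[j], start, j, i)
        if rule == "EST":   # earliest possible start time first
            return (start, p[j][i], j, i)
        raise ValueError(f"unknown rule: {rule}")

    while pending:
        j, i = min(pending, key=priority)
        end = max(job_free[j], mach_free[i]) + p[j][i]
        job_free[j] = mach_free[i] = end
        work_left[j] -= p[j][i]
        pending.remove((j, i))
    return max(mach_free)


def gap_percent(makespan: int, bound: int) -> float:
    """Relative gap to the lower bound, in percent."""
    return 100.0 * (makespan - bound) / bound


if __name__ == "__main__":
    example = [[3, 2], [1, 4]]  # hypothetical 2 jobs x 2 machines instance
    lb = lower_bound(example)
    for rule in ("SPT", "LPT", "MWKR", "EST"):
        cmax = dispatch(example, rule)
        print(rule, cmax, f"{gap_percent(cmax, lb):.2f}%")
```

Because the gap is measured against a lower bound rather than an optimal makespan, a reported gap of 12.89-15.12% is an upper limit on the distance from the optimum, not the distance itself. This sketch rescans all pending operations at every step, which is adequate for illustration but would be slow on 100x100 instances.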

Source: arXiv cs.AI Recent
