---
license: bsd-3-clause
tags:
- InvertedDoublePendulum-v2
- reinforcement-learning
- decisions
- TLA
- deep-reinforcement-learning
model-index:
- name: TLA
  results:
  - metrics:
    - type: mean_reward
      value: 9356.67
      name: mean_reward
    - type: Action Repetition
      value: .7522
      name: Action Repetition
    - type: Average Decisions
      value: 247.76
      name: Average Decisions
    task:
      type: OpenAI Gym
      name: OpenAI Gym
    dataset:
      name: InvertedDoublePendulum-v2
      type: InvertedDoublePendulum-v2
Paper: https://arxiv.org/abs/2305.18701
Code: https://github.com/dee0512/Temporally-Layered-Architecture
---
# Temporally Layered Architecture: InvertedDoublePendulum-v2

These are 10 trained models, one for each of **seeds 0-9**, of the **[Temporally Layered Architecture (TLA)](https://github.com/dee0512/Temporally-Layered-Architecture)** agent playing **InvertedDoublePendulum-v2**.

## Model Sources

**Repository:** [https://github.com/dee0512/Temporally-Layered-Architecture](https://github.com/dee0512/Temporally-Layered-Architecture)
**Paper:** [https://doi.org/10.1162/neco_a_01718](https://doi.org/10.1162/neco_a_01718)
**arXiv:** [arxiv.org/abs/2305.18701](https://arxiv.org/abs/2305.18701)

# Training Details:
To train a model, run `main.py` from the repository with an environment name and a seed:

```
python main.py --env_name <environment> --seed <seed>
```
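
The models on this card cover seeds 0-9 of InvertedDoublePendulum-v2, so reproducing them means repeating the command above once per seed. The following is a minimal sketch that only builds and prints those ten commands. It uses nothing beyond the two flags shown above; the helper name `build_train_commands` is hypothetical and not part of the repository.

```python
import shlex

ENV_NAME = "InvertedDoublePendulum-v2"
SEEDS = range(10)  # seeds 0-9, matching the models on this card


def build_train_commands(env_name=ENV_NAME, seeds=SEEDS):
    """Return one argv list per seed (hypothetical helper, not in the repository)."""
    return [
        ["python", "main.py", "--env_name", env_name, "--seed", str(seed)]
        for seed in seeds
    ]


# Print the commands rather than running them, so the sketch has no side effects.
for cmd in build_train_commands():
    print(shlex.join(cmd))
```

To launch the runs, each list could be passed to `subprocess.run(cmd, check=True)` from the directory that contains `main.py`.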

# Evaluation:

Download the models folder and place it in the same directory as the cloned repository. Then run `eval.py` from the repository:

```
python eval.py --env_name <environment>
```

## Metrics:

**mean_reward:** Mean reward over the 10 seeds
**action_repetition:** Percentage of actions that are equal to the previous action
**mean_decisions:** Number of decisions required, where each decision is one neural network (model) forward pass
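
As an illustration of the action repetition metric, the sketch below computes the share of actions that equal the action taken at the previous step. It is not the repository's evaluation code: the function name and the choice to divide by the number of consecutive action pairs are assumptions made for this example.

```python
def action_repetition(actions):
    """Fraction of actions equal to the previous action (illustrative sketch).

    `actions` is a sequence of scalars or of equal-length vectors.
    Assumption: the first action has no predecessor, so the denominator
    is the number of consecutive pairs, len(actions) - 1.
    """
    # Normalize scalars and vectors to tuples so they compare element-wise.
    steps = [tuple(a) if isinstance(a, (list, tuple)) else (a,) for a in actions]
    if len(steps) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(steps, steps[1:]) if prev == cur)
    return repeats / (len(steps) - 1)


# Three of the four consecutive pairs repeat the previous action.
print(action_repetition([0.5, 0.5, 0.5, -0.2, -0.2]))  # 0.75
```

Multiplying the result by 100 gives the value as a percentage.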

# Citation

The paper can be cited using either of the following formats.

## BibTeX:

```
@article{10.1162/neco_a_01718,
  author = {Patel, Devdhar and Sejnowski, Terrence and Siegelmann, Hava},
  title = "{Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures}",
  journal = {Neural Computation},
  pages = {1-30},
  year = {2024},
  month = {10},
  issn = {0899-7667},
  doi = {10.1162/neco_a_01718},
  url = {https://doi.org/10.1162/neco\_a\_01718},
  eprint = {https://direct.mit.edu/neco/article-pdf/doi/10.1162/neco\_a\_01718/2474695/neco\_a\_01718.pdf},
}
```

## APA:
```
Patel, D., Sejnowski, T., & Siegelmann, H. (2024). Optimizing Attention and Cognitive Control Costs Using Temporally Layered Architectures. Neural Computation, 1-30.
```