---
license: bsd-3-clause
tags:
- InvertedDoublePendulum-v2
- reinforcement-learning
- Soft Actor Critic
- SRL
- deep-reinforcement-learning
model-index:
- name: SAC
  results:
  - metrics:
    - type: FAS (J=1)
      value: 0.02178 ± 0.000199
      name: FAS
    - type: FAS (J=2)
      value: 0.073121 ± 0.000214
      name: FAS
    - type: FAS (J=4)
      value: 0.089067 ± 0.01022
      name: FAS
    - type: FAS (J=8)
      value: 0.014685 ± 0.000134
      name: FAS
    - type: FAS (J=16)
      value: 0.014057 ± 0.000458
      name: FAS
    task:
      type: OpenAI Gym
      name: OpenAI Gym
    dataset:
      name: InvertedDoublePendulum-v2
      type: InvertedDoublePendulum-v2
Paper: https://arxiv.org/pdf/2410.08979
Code: https://github.com/dee0512/Sequence-Reinforcement-Learning
---
# Soft Actor-Critic: InvertedDoublePendulum-v2

These are 25 trained models, covering **seeds 0-4** and **J = 1, 2, 4, 8, 16**, of a **Soft Actor-Critic (SAC)** agent playing **InvertedDoublePendulum-v2**, trained for **[Sequence Reinforcement Learning (SRL)](https://github.com/dee0512/Sequence-Reinforcement-Learning)**.
|
## Model Sources

**Repository:** [https://github.com/dee0512/Sequence-Reinforcement-Learning](https://github.com/dee0512/Sequence-Reinforcement-Learning)
**Paper (ICLR):** [https://openreview.net/forum?id=w3iM4WLuvy](https://openreview.net/forum?id=w3iM4WLuvy)
**arXiv:** [arxiv.org/pdf/2410.08979](https://arxiv.org/pdf/2410.08979)
|
## Training Details

To train a model, run the following from the cloned repository:

```
python train_sac.py --env_name <env_name> --seed <seed> --j <j>
```
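To reproduce all 25 models, the training command can be looped over every seed and J value; a minimal sketch, assuming the `train_sac.py` interface shown above:

```shell
# Print the full training command for each of the 25 (seed, J) combinations:
# seeds 0-4 crossed with J in {1, 2, 4, 8, 16}. Remove `echo` to actually train.
for seed in 0 1 2 3 4; do
  for j in 1 2 4 8 16; do
    echo python train_sac.py --env_name InvertedDoublePendulum-v2 --seed "$seed" --j "$j"
  done
done
```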
|
## Evaluation

Download the models folder and place it in the same directory as the cloned repository, then run:

```
python eval_sac.py --env_name <env_name> --seed <seed> --j <j>
```
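Likewise, every saved checkpoint can be evaluated in one pass; a minimal sketch, assuming `eval_sac.py` accepts the same arguments as the training script:

```shell
# Print the evaluation command for each (seed, J) checkpoint of the
# InvertedDoublePendulum-v2 models. Remove `echo` to run the evaluations.
for seed in 0 1 2 3 4; do
  for j in 1 2 4 8 16; do
    echo python eval_sac.py --env_name InvertedDoublePendulum-v2 --seed "$seed" --j "$j"
  done
done
```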
|
## Metrics

**FAS:** Frequency Averaged Score
**J:** action-repetition parameter
## Citation

The paper can be cited with the following BibTeX entry:

### BibTeX

```
@inproceedings{PatelS25,
  author    = {Devdhar Patel and Hava T. Siegelmann},
  title     = {Overcoming Slow Decision Frequencies in Continuous Control: Model-Based
               Sequence Reinforcement Learning for Model-Free Control},
  booktitle = {The Thirteenth International Conference on Learning Representations,
               {ICLR} 2025, Singapore, April 24-28, 2025},
  publisher = {OpenReview.net},
  year      = {2025},
  url       = {https://openreview.net/forum?id=w3iM4WLuvy}
}
```
|
### APA

```
Patel, D., & Siegelmann, H. T. (2025). Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. In The Thirteenth International Conference on Learning Representations.
```