Tutorials
=========

Getting Started
^^^^^^^^^^^^^^^

Welcome to this tutorial on how to install and use **DockOnSurf** to help you find the most stable geometry for molecules on surfaces. As an example, we will use DockOnSurf to investigate the adsorption of isopropanol on a (100) γ-Al₂O₃ surface using **VASP**.

Goals of the Tutorial
----------------------

This tutorial has three main goals:

1. **Installing DockOnSurf**: We will cover how to install DockOnSurf from the Git repository and initialize the Conda environment.
2. **Creating Input Files**: You will learn how to build simple DockOnSurf input files for the different run types implemented in the software.
3. **Practical Application**: Finally, we will use DockOnSurf to perform various types of geometry optimization series, including isopropanol conformer sampling, followed by the generation and optimization of the adsorbate-surface geometry for isopropanol on the γ-alumina surface.

Installing DockOnSurf
^^^^^^^^^^^^^^^^^^^^^

Step 1: Clone the Repository and Set Up the PATH
------------------------------------------------

First, download the DockOnSurf Git repository:

::

    git clone https://github.com/your-repository-url/dockonsurf.git

Next, add the DockOnSurf directory to your `PATH`. For example, you can add this line to your `.bashrc` file:

::

    export PATH="$HOME/dockonsurf:$PATH"

Replace `$HOME/dockonsurf` with the actual path to the DockOnSurf directory on your system (written as `/PATH/dockonsurf` in the rest of this tutorial). Reload your shell configuration to apply the changes:

::

    source ~/.bashrc

Step 2: Install Conda
---------------------

If you haven't already installed Conda, you can use either **Miniconda** or **Miniforge**:

- Miniconda: https://docs.anaconda.com/miniconda/install/
- Miniforge: https://conda-forge.org/download/

Once installed, open your terminal.
On a UNIX system, you should see `(base)` at the beginning of your prompt, indicating that Conda has been successfully installed:

::

    (base) user@hostname:~$

If `(base)` does not appear, you may need to add the Conda initialization to your `.bashrc`.

Step 3: Create and Activate the Conda Environment
-------------------------------------------------

Navigate to the `examples` directory in DockOnSurf:

::

    cd /PATH/dockonsurf/examples

Create the Conda environment using the provided `.yml` file:

::

    conda env create -f dockonsurf.yml

Follow the instructions in the terminal to complete the setup. After the installation, activate the new environment:

::

    conda activate dockonsurf

Your terminal should now show `(dockonsurf)`:

::

    (dockonsurf) user@hostname:~$

This means you have successfully installed and activated the Python environment compatible with DockOnSurf. You are now ready to run the software. If you are not familiar with Conda environments, see https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

Creating Input Files
^^^^^^^^^^^^^^^^^^^^

In this tutorial, we present examples for the three types of runs that are feasible with **DockOnSurf** using **VASP**. We highly recommend reading the section **Input File Reference Manual** to gain a comprehensive understanding of all the keywords used in DockOnSurf. Additionally, while VASP input files are necessary to run the simulations, we only describe them briefly. For a full understanding of VASP input files, we strongly recommend consulting the official VASP manual: `The VASP Manual <https://www.vasp.at/wiki/index.php/The_VASP_Manual>`_.
First, from the Git repository, navigate to the `tutorial` directory in DockOnSurf:

::

    cd /PATH/dockonsurf/tutorial

Inside the `tutorial` directory, you will find the following subdirectories:

::

    prep_isolated
    prep_screening
    prep_refinement
    tools

These subdirectories contain all the files necessary to run the three run types in sequence, **Isolated**, **Screening** and **Refinement**, to find stable geometries of **isopropanol** on the **(100) γ-Al₂O₃ surface**. Explaining the full strategy behind the three run types is beyond the scope of this tutorial; we recommend reading the DockOnSurf release article: https://pubs.acs.org/doi/10.1021/acs.jcim.1c00256.

**Tools Directory**: The `tools` directory contains simple bash and Python scripts that will be used during the analysis.

prep_isolated
--------------

The **isolated** step explores the conformational space of isopropanol. Inside the `prep_isolated` directory, you will find the following five files:

::

    INCAR
    KPOINTS
    POSCAR
    POTCAR
    dockonsurf_isolated_vasp.inp

The last file, `dockonsurf_isolated_vasp.inp`, is the input file for the `dockonsurf` tool, which is used to perform the isolated procedure. This input file is divided into two sections, **Global** and **Isolated**:

::

    [Global]
    project_name = Isolated_isopropanol
    run_type = isolated
    code = VASP
    batch_q_sys = slurm
    subm_script = VASP.sh
    potcar_dir = /vasp/PSEUDOS_DATABASIS/XXXX/gw/potpaw/pbe/

    [Isolated]
    isol_inp_file = INCAR POTCAR KPOINTS
    molec_file = POSCAR
    num_conformers = 30
    pre_opt = MMFF

Global Section
~~~~~~~~~~~~~~

From the **Global** section, we see that the computation is configured to run on a cluster using the Slurm job submission system, with a submission script named `VASP.sh`. This script must be in the same directory and handles the VASP calculations.
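Before launching a run, it can be useful to check that the files named in the input file are actually present. The following is a minimal sketch and not part of DockOnSurf: it assumes the `.inp` file can be read with Python's standard `configparser` module (the example above follows INI syntax), and the helper name `check_isolated_input` is hypothetical. It only looks at the keywords shown in this tutorial. Note that a missing `POTCAR` is only a problem if `potcar_dir` is not set.

.. code-block:: python

    import configparser
    from pathlib import Path


    def check_isolated_input(inp_path, workdir="."):
        """Return the files named in the input file that are missing.

        Hypothetical helper (not part of DockOnSurf): it only inspects
        the keywords shown in this tutorial.
        """
        # Disable interpolation so that '%' in a value is read literally.
        parser = configparser.ConfigParser(interpolation=None)
        with open(inp_path) as handle:
            parser.read_file(handle)

        workdir = Path(workdir)
        # Submission script, VASP input files and molecule file, in that order.
        wanted = [parser["Global"]["subm_script"]]
        wanted += parser["Isolated"]["isol_inp_file"].split()
        wanted.append(parser["Isolated"]["molec_file"])

        return [name for name in wanted if not (workdir / name).is_file()]

    # Example call from the tutorial directory:
    # check_isolated_input("dockonsurf_isolated_vasp.inp")

An empty list means that every file referenced by these keywords was found in the working directory.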
To proceed with the tutorial, make sure the input file is correctly configured for your own job submission system and adjust the `potcar_dir` value:

- If the `POTCAR` file is already provided, you can omit the `potcar_dir` setting.
- Otherwise, ensure `potcar_dir` points to the directory containing the pseudopotentials.

Isolated Section
~~~~~~~~~~~~~~~~

The number of raw conformers to generate (`num_conformers`) is set to 30 in this tutorial, which keeps the computational cost low and is consistent with the small size of isopropanol. For production studies of larger molecules, a higher number of conformers (e.g., up to 100) is recommended for better sampling of the conformational space. Before running the geometry optimization with VASP, the isopropanol is pre-optimized using the **MMFF** force field.

The **Isolated** block also specifies the files needed to run the VASP geometry optimization:

- **INCAR**: Defines the calculation parameters, settings, and convergence criteria. In this tutorial, it is configured for a simple geometry optimization to quickly obtain results while minimizing computational cost. For production runs, we recommend adjusting the level of theory for higher accuracy.
- **KPOINTS**: Specifies the k-point sampling in reciprocal space. For a non-periodic system like a single molecule, the grid is set to 1x1x1 to avoid unnecessary sampling.
- **POSCAR**: Contains the atomic positions and the simulation cell dimensions of isopropanol.
- **POTCAR**: Provides the pseudopotentials for the atoms in the molecule.

prep_screening
---------------

The **screening** step explores the adsorbate-surface configurational space by combining the conformational sampling from the isolated step with orientational and translational sampling.
Inside the `prep_screening` directory, you will find the following five files:

::

    INCAR
    KPOINTS
    POSCAR
    POTCAR
    dockonsurf_euler_screening.inp

The DockOnSurf input file `dockonsurf_euler_screening.inp` for the screening procedure is divided into two sections:

::

    [Global]
    project_name = Screening_isopropanol
    run_type = screening
    code = VASP
    batch_q_sys = slurm
    subm_script = VASP.sh
    potcar_dir = /vasp/PSEUDOS_DATABASIS/XXXX/gw/potpaw/pbe/

    [Screening]
    screen_inp_file = INCAR KPOINTS POTCAR
    surf_file = POSCAR
    select_magns = energy
    confs_per_magn = 1
    adsorption_height = 3
    set_angles = euler
    sample_points_per_angle = 3
    surf_norm_vect = z
    min_coll_height = 2.5
    sites = 62, 59, 54, 45, (59 62), (54, 45), (45, 54, 59)
    molec_ctrs = 3, 4, (3 4)
    h_donor = O
    h_acceptor = O
    max_structures = 200

Global Section
~~~~~~~~~~~~~~

The **Global** section is similar to that in the isolated step, except that the `run_type` is set to `screening`.

Screening Section
~~~~~~~~~~~~~~~~~

The **Screening** section introduces several new parameters. As it uses the structures generated during the isolated procedure, you do not need to specify the isopropanol `POSCAR`.

.. note::

   It is possible to select a different directory from a previous computation as input for the conformers.

However, you must provide the `POSCAR` file for the surface, which in this case is the (100) γ-Al₂O₃ surface, as described in the study https://doi.org/10.1016/j.molcata.2009.01.024. We recommend consulting the input file reference manual for detailed descriptions of all the keywords.

.. note::

   Python indexing starts at 0, so this must be accounted for when specifying adsorption centers/anchoring points on the surface and molecule.

To generate diverse structures while keeping computational time reasonable, we limit the output to 200 structures (`max_structures`) and configure the `INCAR` file with low-to-moderate levels of theory and accuracy for fast results. (For production calculations, higher accuracy settings are recommended.)
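To help pick the values for `sites` and `molec_ctrs`, the atoms of a structure file can be listed with their 0-based indices. The sketch below is illustrative only and not part of DockOnSurf: it assumes the VASP 5 `POSCAR` layout, in which line 6 holds the element symbols and line 7 the number of atoms of each element, and the function name `list_atoms_zero_based` is hypothetical.

.. code-block:: python

    def list_atoms_zero_based(poscar_text):
        """Return (index, element) pairs, counting atoms from 0.

        Assumes the VASP 5 POSCAR layout: element symbols on line 6
        and the matching atom counts on line 7.
        """
        lines = poscar_text.splitlines()
        symbols = lines[5].split()
        counts = [int(n) for n in lines[6].split()]
        # Repeat each symbol as many times as there are atoms of that element.
        elements = [sym for sym, n in zip(symbols, counts) for _ in range(n)]
        return list(enumerate(elements))

    # Example call:
    # with open("POSCAR") as handle:
    #     for index, element in list_atoms_zero_based(handle.read()):
    #         print(index, element)

If your structure viewer numbers atoms starting from 1, the atom it labels N corresponds to index N - 1 in this list.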
Finally, the `KPOINTS` file for VASP computations is set to a 3x3x1 grid.

prep_refinement
----------------

The **refinement** step processes the results obtained from the screening procedure by extracting all structures with relative energies below a specified cutoff compared to the most stable structure. These selected structures are then recomputed at a higher level of theory and accuracy.

Inside the `prep_refinement` directory, you will find the following files:

::

    INCAR
    KPOINTS
    POTCAR
    dockonsurf_euler_refinement.inp

The DockOnSurf input file contains a **Global** section similar to the previous steps, except for the `run_type` and `project_name`. In the **Refinement** section, we find:

::

    [Refinement]
    refine_inp_file = INCAR KPOINTS POTCAR
    energy_cutoff = 10.0

The **Refinement** section specifies:

- The `INCAR`, `KPOINTS`, and `POTCAR` input files for the VASP calculations. Since the structures are extracted from the screening procedure, no `POSCAR` file is required.
- An `energy_cutoff` value (in this case, 10.0 eV), which sets the threshold for selecting structures to refine from the screening results.

.. note::

   The `energy_cutoff` in production is usually much smaller due to higher accuracy during the screening procedure. It is set relatively high in the tutorial to ensure enough structures are available for comparison.

Thus, the main difference from the screening computations is the `INCAR` file, which is adjusted for higher accuracy and allows for more ionic relaxation steps to ensure well-converged results.

Practical Applications
^^^^^^^^^^^^^^^^^^^^^^^

Run Isolated
-------------

Great! Now that you are familiar with the DockOnSurf input files, let's begin the practical part of the tutorial. We will start by running the isolated procedure. To do so, copy all the files from the `prep_isolated` directory into the main `tutorial` directory.
All the runs should be executed in the same directory, as new subdirectories will be created to store the conformers used in subsequent steps. If you are currently in the `tutorial` directory, use the following command to copy the files:

::

    cp prep_isolated/* .

This will copy all the necessary files into the `tutorial` directory. Once the files are in place, run the isolated procedure with the following command:

::

    dockonsurf.py -i dockonsurf_isolated_vasp.inp

If the procedure starts correctly, you should see the following message:

::

    Running DockOnSurf. To check DockOnSurf activity see '/home/PATH/tutorial/dockonsurf.log'.

You can monitor the execution by checking the `dockonsurf.log` file, which provides updates about the process. The procedure should finish with the message:

::

    XX-Month-XX 00:00:00-INFO: DockOnSurf finished.

During the run, a directory named `isolated` will be created, containing 30 subdirectories named `conf_X` (with X ranging from 0 to 29), corresponding to the 30 raw conformers.

Once the computations are complete, check that all calculations finished correctly. Navigate to the `tools` directory and use the `extract_CPU.sh` bash script. Adjust the target directory to `isolated` if it is not already set. This script searches for the string `Total CPU time used (sec):` in all VASP `OUTCAR` files and generates a `CPU.dat` file containing the name of each conformer and its associated CPU time. If a conformer's CPU time is missing, the computation did not finish correctly.

If you want to examine the different conformers, you can also use the `extract_energy.sh` script, setting the target directory to `isolated`. This script extracts the last `E0` value from each `OSZICAR` file and stores them in a file named `E0.dat`. For example, the following image shows the lowest-energy conformer obtained during the sampling:

.. image:: isopropanol.png
   :alt: Lowest-energy conformer of isopropanol
   :align: center

After confirming that all calculations completed successfully, you can proceed to the screening procedure.

Run Screening
-------------

For the next step, you need to replace the input files from the isolated run with the files for the screening run. If you wish to retain the isolated run files, we recommend storing them elsewhere. To proceed with the screening run, copy the files from the `prep_screening` directory. Assuming you are in the `tutorial` directory, use the following command:

::

    cp prep_screening/* .

This will copy the necessary files to run the screening procedure. You can launch the run with the following command:

::

    dockonsurf.py -i dockonsurf_euler_screening.inp

You should see the following output:

::

    Running DockOnSurf. To check DockOnSurf activity see '/home/PATH/tutorial/dockonsurf.log'.

As before, you can monitor the run by checking the `dockonsurf.log` file. The run will create 200 adsorbate-surface structures in a new directory named `screening`, with subdirectories named `conf_X` (where X ranges from 0 to 199).

Once all the VASP calculations are finished, you can perform a quick verification using the `extract_CPU.sh` script to ensure all calculations completed correctly. In this example, we observed that some structures (around 10) did not finish correctly. Here is an example of two structures that did not finish correctly:

.. image:: BadSurface.png
   :alt: Example of incorrect conformers
   :align: center

As we can see, even though the chosen molecule center (the alcohol group of isopropanol) is at the specified distance from the surface, the exploration of the configurational space with our current settings can produce structures in which the adsorbate overlaps with the surface. To reduce the number of such collisions, several options can be considered.
For instance, enabling the `collision_threshold` keyword (see documentation) could help, though it may limit the exploration of the orientational space. Moreover, in systems where anchoring points are expected, using the internal set of angles instead of the Euler angles can improve efficiency (e.g., see the Results and Discussion section of the DockOnSurf article).

After identifying the structures that did not finish correctly, remove them from the `screening` directory. Otherwise, the refinement step will produce an error message indicating that one or more runs did not complete successfully.

Next, you can analyze the energies of the adsorbate-surface structures using the `extract_energy.sh` script. After generating the `E0.dat` file, a simple Python script named `extract_lowest_diff.py` can be used to print the lowest-energy structure and display the smallest energy differences relative to it.

.. warning::

   The energy from the VASP computations is reported in Hartree, while DockOnSurf uses eV as the unit of energy.

In this example, since we did not specify highly accurate or high-level geometry optimizations, large relative energy differences can be expected.
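The analysis step can be sketched in a few lines of Python. This is not the `extract_lowest_diff.py` script shipped in `tools`; it is an illustrative sketch that assumes `E0.dat` contains one structure per line, with the structure name in the first column and its energy in the last column. The function name `lowest_differences` is hypothetical, and the differences are returned in the same unit as the input file.

.. code-block:: python

    def lowest_differences(lines, n=5):
        """Return the n lowest-energy entries as (name, energy, difference)."""
        entries = []
        for line in lines:
            fields = line.split()
            if len(fields) < 2:
                continue  # blank line, or a structure without an energy
            entries.append((fields[0], float(fields[-1])))
        if not entries:
            return []
        # Sort by energy; the first entry becomes the reference.
        entries.sort(key=lambda item: item[1])
        reference = entries[0][1]
        return [(name, energy, energy - reference) for name, energy in entries[:n]]

    # Example call:
    # with open("E0.dat") as handle:
    #     for name, energy, diff in lowest_differences(handle):
    #         print(f"{name}: Energy = {energy:.8f}, Difference = {diff:.8f}")

Lines without an energy are skipped here, so structures whose calculation failed should still be identified separately with `extract_CPU.sh`.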
In our run, we obtained the following results:

::

    Five lowest energy differences (reference: -17964.52991662):
    conf_18: Energy = -17964.52991662, Difference = 0.00000000
    conf_106: Energy = -17959.06831964, Difference = 5.46159698
    conf_105: Energy = -17958.45170966, Difference = 6.07820696
    conf_52: Energy = -17957.52978796, Difference = 7.00012866
    conf_173: Energy = -17956.20241662, Difference = 8.32750000
    conf_17: Energy = -17955.92213936, Difference = 8.60777726
    conf_16: Energy = -17955.85764838, Difference = 8.67226824
    conf_188: Energy = -17955.54417323, Difference = 8.98574339
    conf_101: Energy = -17955.23124231, Difference = 9.29867431
    conf_156: Energy = -17955.06742977, Difference = 9.46248685
    conf_54: Energy = -17954.88538561, Difference = 9.64453101
    conf_64: Energy = -17954.28156499, Difference = 10.24835163
    ...

To ensure sufficient structures for the refinement step, we set `energy_cutoff = 10.0` in the DockOnSurf input file, allowing several structures to be investigated during the refinement. After removing the `conf_X` directories that failed and setting up the DockOnSurf input file for the refinement, you can advance to the final step.

Run Refinement
--------------

As with the previous steps, you need to replace the input files with those from the `prep_refinement` directory. Assuming you are in the `tutorial` directory, use the following command:

::

    cp prep_refinement/* .

This will copy the necessary files to run the refinement procedure. You can launch the refinement run with the following command:

::

    dockonsurf.py -i dockonsurf_euler_refinement.inp

This should start the refinement run, and you will see the "Running DockOnSurf" message, along with the creation of log files. The run will generate a folder named `refinement`, containing all the `conf_X` directories (where X corresponds to the structures meeting the relative energy cutoff).
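The effect of `energy_cutoff` on the screening results listed above can be illustrated with a short sketch. It assumes that a structure is kept when its energy difference to the most stable structure does not exceed the cutoff; the exact comparison used internally by DockOnSurf is not shown in this tutorial, and the function name `select_within_cutoff` is hypothetical. The three energies are taken from our screening run.

.. code-block:: python

    def select_within_cutoff(energies, cutoff):
        """Return the names whose energy lies within `cutoff` of the minimum."""
        lowest = min(energies.values())
        return [name for name, e in energies.items() if e - lowest <= cutoff]


    # Three of the screening energies reported in this tutorial.
    screening = {
        "conf_18": -17964.52991662,
        "conf_54": -17954.88538561,
        "conf_64": -17954.28156499,
    }
    print(select_within_cutoff(screening, cutoff=10.0))

With a cutoff of 10.0, `conf_18` and `conf_54` are selected, while `conf_64`, which lies about 10.25 above the minimum, is left out.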
Since the level of theory and accuracy is increased in the refinement step, these calculations may take significantly longer. With the same submission script setup and the tutorial configuration, the runtime for refinement can be an order of magnitude longer than for screening.

Finally, verify that all calculations finished correctly. In the tutorial, all computations completed successfully but reached the maximum number of ionic relaxation steps (`NSW`). As a result, the structures may not be fully optimized. However, to limit CPU consumption for the tutorial, we analyze the results obtained at this stage without further refinement.

The following figure presents ten refined structures obtained after the refinement procedure, with the lowest-energy structure highlighted by a green rectangle:

.. image:: refined_structure.png
   :alt: Example of refined adsorbate-surface geometries
   :align: center

If you are specifically interested in the adsorption of isopropanol on the (100) γ-Al₂O₃ surface, the theoretical article available at https://doi.org/10.1016/j.molcata.2009.01.024 provides a useful comparison. From the simple series performed during this tutorial, we obtained results similar to those shown in the referenced article. This demonstrates how DockOnSurf can be used effectively to investigate stable geometries for molecules on surfaces.