1. Find the top three categories under "conditions" that have the largest number of US clinical trials. Link: [url removed, login to view]
2. Search by the three categories found in Step 1. Link: [url removed, login to view]
3. Download all the studies found by the search as XML. I need all 20 available fields for each study in the download window (use the "download selected fields" option, as in this example: [url removed, login to view]+attack&show_down=Y). Link: [url removed, login to view]. Use the "Download a List of Records" method described there, and also check out "Download Multiple Records in XML"; use whichever one is easier. Use the schema [url removed, login to view] if necessary.
4. The final dataset should have one trial per row. If a study has several trials, that study should have multiple rows, each representing a trial phase.
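
To make the "one trial per row" requirement in Step 4 concrete, here is a minimal Python sketch of one possible way to flatten a study record. It is not a confirmed description of the download format: the element names (`clinical_study`, `id_info/nct_id`, `brief_title`, `phase`), the idea that several phases arrive joined by "/" in a single `phase` element, and the helper names `study_to_rows` and `rows_to_csv` are all assumptions made for illustration and should be checked against the schema and a real downloaded file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample record; element names are assumptions to verify
# against the schema and an actual downloaded study.
SAMPLE_XML = """<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <brief_title>Example study</brief_title>
  <phase>Phase 1/Phase 2</phase>
</clinical_study>"""


def study_to_rows(xml_text):
    """Return one dict per trial phase for a single study record."""
    root = ET.fromstring(xml_text)
    nct_id = root.findtext("id_info/nct_id", default="")
    title = root.findtext("brief_title", default="")
    phase_text = root.findtext("phase", default="")
    # Assumption: multiple phases are joined with "/" in one <phase> element.
    # A study with no phase still yields a single row with an empty phase.
    phases = [p.strip() for p in phase_text.split("/") if p.strip()] or [""]
    return [{"nct_id": nct_id, "title": title, "phase": p} for p in phases]


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["nct_id", "title", "phase"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = study_to_rows(SAMPLE_XML)
print(rows_to_csv(rows))
```

Under these assumptions the sample study produces two rows, one per phase, that share the same identifier. If the real data encodes phases differently, only the splitting line needs to change.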
Note: I haven't fully explored the data myself. You might need to use [url removed, login to view] to retrieve historical data, using the NCT number as the search criterion. The fields I will focus on are "sponsor", "collaborator", and "investigator". Please write down the steps you take in detail so I can replicate and verify them.
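
As a rough illustration of pulling out the three fields of interest, the Python sketch below reads the sponsor, collaborators, and investigators from one study record. The element paths (`sponsors/lead_sponsor/agency`, `sponsors/collaborator/agency`, `overall_official/last_name`) and the helper name `extract_parties` are assumptions chosen for the example, not something confirmed by this brief; verify them against the schema before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the element paths are assumptions to verify
# against the schema and an actual downloaded study.
SAMPLE_XML = """<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <sponsors>
    <lead_sponsor><agency>Example University</agency></lead_sponsor>
    <collaborator><agency>Example Institute</agency></collaborator>
    <collaborator><agency>Example Pharma</agency></collaborator>
  </sponsors>
  <overall_official>
    <last_name>Jane Doe, MD</last_name>
    <role>Principal Investigator</role>
  </overall_official>
</clinical_study>"""


def texts(root, path):
    """Collect the stripped, non-empty text of every element matching path."""
    return [el.text.strip() for el in root.findall(path) if el.text and el.text.strip()]


def extract_parties(xml_text):
    """Return the sponsor, collaborators, and investigators of one study."""
    root = ET.fromstring(xml_text)
    return {
        "nct_id": root.findtext("id_info/nct_id", default=""),
        "sponsor": root.findtext("sponsors/lead_sponsor/agency", default=""),
        # Repeated elements are joined with "; " so each study fits in one cell.
        "collaborator": "; ".join(texts(root, "sponsors/collaborator/agency")),
        "investigator": "; ".join(texts(root, "overall_official/last_name")),
    }


print(extract_parties(SAMPLE_XML))
```

Joining repeated values with "; " keeps one cell per field; if each collaborator or investigator should instead get its own row or column, that choice would need to be stated in the write-up of the steps.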