Question: Computational Science

This question is designed to evaluate the candidate's understanding of scientific computing and their problem-solving ability. Hint: all questions are open-ended. You may use any programming language or existing scientific math libraries. If you do, please provide some information about the languages or libraries you use.
Assume that you have to work with a large file (over 3 GB) containing around 10 million records of scientific data. Each record has 6 fields: a date (string), a location (two doubles: longitude and latitude), value 1 (double, such as temperature), value 2 (double, such as concentration), and a short description (string).
You are required to design an efficient data structure (e.g. a tree or any other data structure) and a strategy for managing the data file to support the following operations.
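One possible starting point, sketched in Python with only the standard library: a uniform longitude/latitude grid that buckets records by cell, so a bounding-box lookup only inspects the cells that overlap the box. Everything named here (`Record`, `GridIndex`, `cell_deg`, `time_query`, the ISO date strings) is an assumption made for illustration, not part of the question. The sketch keeps records in memory; for a 3 GB file one would more likely store byte offsets into the file in each cell. It also ignores boxes that cross the 180° meridian. A k-d tree or R-tree would be an equally valid choice.

```python
import math
import time
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    date: str         # assumed ISO format, e.g. "2015-07-04"
    lon: float
    lat: float
    value1: float     # e.g. temperature
    value2: float     # e.g. concentration
    description: str


class GridIndex:
    """Uniform grid over lon/lat; each cell lists positions of its records."""

    def __init__(self, cell_deg: float = 1.0):
        self.cell_deg = cell_deg          # cell size in degrees (hypothetical default)
        self.cells = defaultdict(list)    # (cx, cy) -> list of record positions
        self.records = []

    def _cell(self, lon: float, lat: float):
        return (math.floor(lon / self.cell_deg), math.floor(lat / self.cell_deg))

    def insert(self, rec: Record) -> None:
        self.cells[self._cell(rec.lon, rec.lat)].append(len(self.records))
        self.records.append(rec)

    def query(self, lon1, lon2, lat1, lat2):
        """Return all records inside the box (bounds inclusive, any order of corners)."""
        lon_lo, lon_hi = sorted((lon1, lon2))
        lat_lo, lat_hi = sorted((lat1, lat2))
        cx_lo, cy_lo = self._cell(lon_lo, lat_lo)
        cx_hi, cy_hi = self._cell(lon_hi, lat_hi)
        hits = []
        for cx in range(cx_lo, cx_hi + 1):
            for cy in range(cy_lo, cy_hi + 1):
                # .get avoids creating empty cells in the defaultdict
                for i in self.cells.get((cx, cy), ()):
                    r = self.records[i]
                    # Edge cells are only partly covered, so test each point.
                    if lon_lo <= r.lon <= lon_hi and lat_lo <= r.lat <= lat_hi:
                        hits.append(r)
        return hits


def time_query(index: GridIndex, box, repeats: int = 5) -> float:
    """Average wall-clock seconds per query, measured with perf_counter."""
    start = time.perf_counter()
    for _ in range(repeats):
        index.query(*box)
    return (time.perf_counter() - start) / repeats


index = GridIndex(cell_deg=1.0)
index.insert(Record("2015-01-15", 10.5, 45.2, 3.1, 0.8, "a"))
index.insert(Record("2015-07-04", 11.9, 46.0, 24.0, 1.9, "b"))
index.insert(Record("2015-07-05", -70.0, -33.4, 12.0, 0.4, "c"))
inside = index.query(10.0, 12.0, 45.0, 47.0)
```

For the execution-time estimate, one hedged approach is to build the index on a small sample (say 1% of the file), measure `time_query` for typical boxes, and extrapolate. The query cost should grow roughly with the number of cells touched plus the number of records stored in those cells, so the extrapolation is only as good as the sample's spatial distribution.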

  1. Find all entries located within a user-defined region (Longitude 1, Longitude 2, Latitude 1, Latitude 2). Please also describe your method or strategy for estimating the execution time of your operations.
  1. If you have access to a cluster with 64 CPUs, describe your methods or strategies for parallelizing your algorithm from Question 1. (Hint: it is fine to redesign your data structure separately for this question.)
  1. Extra question (optional): If Question 1 returns over 2 million records, develop an efficient way to calculate the relationship between value 1 (temperature) and value 2 (concentration) in each season. (Hint: Winter is December to February, Spring is March to May, Summer is June to August, and Fall is September to November.)
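For the parallel and seasonal parts, one way to read "relationship" is the Pearson correlation between value 1 and value 2 per season; that interpretation is an assumption, and a regression fit would be another reasonable reading. The sketch below splits the matched records into chunks, reduces each chunk to per-season running sums, and merges the partial sums at the end. Because each chunk is processed independently, the map step can be handed to a pool of workers. The function names and the ISO "YYYY-MM-DD" date format are hypothetical.

```python
import math
from functools import reduce

SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def season_of(date: str) -> str:
    # Assumes ISO dates ("YYYY-MM-DD"); adjust the slice for other formats.
    return SEASON_BY_MONTH[int(date[5:7])]


def chunk_stats(rows):
    """Reduce (date, value1, value2) rows to {season: [n, Sx, Sy, Sxx, Syy, Sxy]}."""
    stats = {}
    for date, x, y in rows:
        s = stats.setdefault(season_of(date), [0, 0.0, 0.0, 0.0, 0.0, 0.0])
        s[0] += 1
        s[1] += x
        s[2] += y
        s[3] += x * x
        s[4] += y * y
        s[5] += x * y
    return stats


def merge_stats(a, b):
    """Combine two partial results by adding the running sums per season."""
    out = {k: list(v) for k, v in a.items()}
    for k, v in b.items():
        out[k] = [p + q for p, q in zip(out[k], v)] if k in out else list(v)
    return out


def pearson(s):
    """Pearson r from running sums; NaN when either variable has no variance."""
    n, sx, sy, sxx, syy, sxy = s
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    if var_x <= 0 or var_y <= 0:
        return float("nan")
    return (n * sxy - sx * sy) / math.sqrt(var_x * var_y)


def seasonal_correlation(chunks, map_fn=map):
    """map_fn is the pluggable map step: built-in map here, a pool's map in parallel."""
    partials = map_fn(chunk_stats, chunks)
    total = reduce(merge_stats, partials, {})
    return {season: pearson(s) for season, s in total.items()}


chunks = [
    [("2015-07-01", 1.0, 2.0), ("2015-07-02", 2.0, 4.0)],
    [("2015-08-01", 3.0, 6.0), ("2015-12-01", 1.0, -1.0),
     ("2016-01-01", 2.0, -2.0), ("2016-02-01", 4.0, -4.0)],
]
result = seasonal_correlation(chunks)
```

On a single multi-core node, passing `map_fn=pool.map` from a `multiprocessing.Pool(processes=64)` would distribute the chunks across workers; on a multi-node cluster an MPI-style scatter and reduce would follow the same pattern. Note that the sum-of-squares formula can lose precision when values are large relative to their spread, so a Welford-style update may be preferable for real data.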
