Your company uses a proprietary system to send

This preview shows page 7 - 9 out of 21 pages.

21. Your company uses a proprietary system to send inventory data every 6 hours to a data ingestion service in the cloud. Transmitted data includes a payload of several fields and the timestamp of the transmission. If there are any concerns about a transmission, the system re-transmits the data. How should you deduplicate the data most efficiently?
A. Assign global unique identifiers (GUID) to each data entry.
B. Compute the hash value of each data entry, and compare it with all historical data.
C. Store each data entry as the primary key in a separate database and apply an index.
D. Maintain a database table to store the hash value and other metadata for each data entry.
Answer: D
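The approach in option D can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it uses an in-memory SQLite database in place of a cloud database, and the table name `seen_entries`, the helpers `payload_hash` and `ingest`, and the sample fields `sku` and `qty` are all hypothetical. It also assumes that only the payload fields are hashed, since a re-transmission would presumably carry a new transmission timestamp.

```python
import hashlib
import json
import sqlite3


def payload_hash(payload: dict) -> str:
    """Return a SHA-256 hex digest computed over the payload fields only."""
    # Sort keys so the same fields always serialize to the same bytes.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(conn: sqlite3.Connection, payload: dict, sent_at: str) -> bool:
    """Record an entry; return True if it is new, False if it is a duplicate."""
    # The primary key on entry_hash makes the database reject repeats,
    # so no scan over historical data is needed.
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_entries (entry_hash, first_sent_at) "
        "VALUES (?, ?)",
        (payload_hash(payload), sent_at),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical schema: the hash plus a small amount of metadata per entry.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seen_entries ("
    "entry_hash TEXT PRIMARY KEY, first_sent_at TEXT NOT NULL)"
)

# Assumption: the transmission timestamp is kept as metadata, not hashed.
first = ingest(conn, {"sku": "A-100", "qty": 42}, "2024-01-01T00:00:00Z")
retry = ingest(conn, {"qty": 42, "sku": "A-100"}, "2024-01-01T00:00:05Z")
print(first, retry)
```

Because the hash column is indexed as the primary key, each duplicate check is a single key lookup, whereas option B compares every new entry against all historical data.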
22. Your company has hired a new data scientist who wants to perform complicated analyses across very large datasets stored in Google Cloud Storage and in a Cassandra cluster on Google Compute Engine. The scientist primarily wants to create labelled data sets for machine learning projects, along with some visualization tasks. She reports that her laptop is not powerful enough to perform her tasks and it is slowing her down. You want to help her perform her tasks. What should you do?
23. You are deploying 10,000 new Internet of Things devices to collect temperature data in your
What should you do?
24. You have spent a few days loading data from comma-separated values (CSV) files into the Google BigQuery table CLICK_STREAM. The column DT stores the epoch time of click events. For convenience, you chose a simple schema where every field is treated as the STRING type. Now you want to compute the web session durations of users who visit your site, and you want to change the data type of DT to TIMESTAMP. You want to minimize the migration effort without making future queries computationally expensive. What should you do?
A. Delete the table CLICK_STREAM, and then re-create it such that the column DT is of the TIMESTAMP type. Reload the data.
