2.1 The Contents of Open Data
Open data come from various sources, including Open Government Data (OGD), Open Industrial Data (OID), Open Enterprise Data (OED) and Open Personal Data (OPD). OGD is the major part of open data because governments have accumulated large amounts of data and have become the largest owners of data in terms of volume. Governments are currently expected to publish open data to maximize public reuse, not only to strengthen transparency and promote efficiency and effectiveness in administration, but also to create economic opportunity and improve citizens' quality of life (QoL). OGD includes geographical, environmental, weather, education, agriculture and occupational safety data as well as economic data, which help citizens to be better informed and make government more efficient.
Open scientific data is another important source of open data, including experimental data, genomes, chemical compounds, mathematical and scientific formulae, medical data and practice, and bioscience and biodiversity data. Most of this fundamental research is financed by governments and is funded with the aim of disclosing its results, so it faces few limits on openness. Problems more often arise with open industrial and enterprise data, because these data are commercially valuable or can be aggregated into works of value. In these cases, access to or reuse of the data is controlled by organizations through access restrictions, licenses, copyright, patents and charges for access or reuse. It is important that the data are reusable without requiring further permission, although the types of reuse (such as the creation of derivative works) may be controlled by a license. Open personal data is also used in research projects. Companies like Microsoft and Yahoo investigate their consumers' Internet behavior in accordance with their respective user approval policies.
It is important to note that new aspects of data management, in particular anonymization, are essential to achieving open data management in a smart sustainable city.
Various institutions such as medical facilities, transportation facilities and government agencies must manage large amounts of data, which may include private customer information, medical records and transaction information. These data, commonly stored in electronic form, often contain sensitive personal information. Such data are useful, and frequently necessary, in smart sustainable cities to facilitate the provision of advanced services. However, the stored data may contain a considerable amount of personal and sensitive information about individuals, such as age and address as well as more sensitive items such as financial data, medical records, personal preferences and behavior history. In the interest of the individuals, it is essential that data containing sensitive information be protected from unauthorized use.
On the other hand, organizations should make private data available so that the data can be turned into useful services in smart sustainable cities. In providing the data, preserving privacy at the required level should be prioritized. It is possible that two different data sets, each generated by a suitable anonymization process at the required anonymization level, together reveal the original plain data. This situation can be harmful when publishing data, even if each data set is appropriately anonymized. Moreover, the items of information described below are needed from the viewpoint of both the original data provider and the application service provider using the data. In providing data as open data, the anonymization process is indispensable to maximize the use of private data in rich services.
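The risk noted above, that two separately anonymized releases can together narrow down the original values, can be illustrated with a small sketch. It assumes that each release generalizes an age into a fixed-width band and that the two releases use different band boundaries. The function names, band widths, offsets and ages are invented for illustration and do not come from this report.

```python
def band(age, width, offset):
    """Generalize an exact age to the closed band [low, low + width - 1].

    Bands start at `offset` and repeat every `width` years (assumed scheme).
    """
    low = ((age - offset) // width) * width + offset
    return (low, low + width - 1)


def intersect(a, b):
    """Return the overlap of two closed integer intervals, or None if disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


def candidates(interval):
    """Count the integer values that an interval still allows."""
    return interval[1] - interval[0] + 1


# The same (hypothetical) person appears in two releases.
# Release A uses 10-year bands starting at 0, release B at 5.
in_a = band(37, 10, 0)
in_b = band(37, 10, 5)
combined = intersect(in_a, in_b)
print(in_a, in_b, combined, candidates(combined))

# With bands starting at 0 and at 9, an age of 39 is pinned down exactly.
exact = intersect(band(39, 10, 0), band(39, 10, 9))
print(exact, candidates(exact))
```

Each release on its own leaves ten possible ages, yet an observer who knows that the person appears in both can intersect the bands. This is why the anonymization level of a single release does not by itself describe the exposure once several releases are published.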
Contents of data
Services or applications of smart sustainable cities mainly focus on the contents of data. The data can be separated into two parts: a header, which serves as an index or set of tags for data management, and the contents. The contents are the main part of the data. For the entity providing an application and for its users, the type of information included is important, and the value of an application of the data strongly depends on it.
Ownership of data
For the application service provider of the data and its users, the data have to be reliable and usable for addressing pertinent concerns. Data providers have to give the name of the organization in order to add value to the data. In short, how to authorize the owner of the data is another issue for data services in a smart sustainable city.
Generation date/time and expiration of data
For the application service provider of the data and its users, the date and time at which the data were generated is valuable information for determining the relevance and freshness of the data. In some cases, the data are analyzed for historical trends or changes, and the generation date is fundamental to this analysis. In addition, the expiration date and time is required. Traffic information and market information are examples of data for which the expiration date and time is indispensable.
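A minimal sketch of how an application might use the generation and expiration times is given here. The function name, the 15-minute lifetime for a traffic report and the choice to treat the expiration instant itself as already expired are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_usable(generated_at, expires_at, now):
    """Return True while `now` lies in [generated_at, expires_at).

    Treating the expiration instant as expired is an assumed convention.
    """
    return generated_at <= now < expires_at


# Hypothetical traffic report with an assumed lifetime of 15 minutes.
generated = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
expires = generated + timedelta(minutes=15)

print(is_usable(generated, expires, generated + timedelta(minutes=5)))   # True
print(is_usable(generated, expires, generated + timedelta(minutes=20)))  # False
```

Using timezone-aware timestamps avoids ambiguity when the data provider and the application run in different time zones.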
Update of the contents
Some smart sustainable city data require continuous updating. Unlike static information, historical information needs to be updated to keep the data fresh. The frequency or interval of these updates may influence the value of the data.
Anonymizer of the data
For the application service provider of the data and its users, information on who or which organization anonymized the data is required in order to know whether the data are trustworthy and can be used. An authorized anonymization service provider should anonymize the data to maximize the value of the anonymized data. Namely, how to authorize the anonymization service provider is another concern in data services in smart sustainable cities.
Anonymized date/time of data
The date and time of anonymization may not directly concern applications. However, it is needed for the traceability of data processing. When an attack on privacy occurs, this information becomes a significant source for its detection and prevention.
Anonymized method and level
In data anonymization, the method and the level of anonymization are matters of concern to data application providers and users. The level of anonymization can affect the information loss of the data [17]. Information loss is an index of similarity with the original plain data: information can be changed or lost when data are anonymized, and information loss indicates the degree of that change or loss. Existing anonymization methods and their levels are described in section 8.
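As a rough illustration of how the anonymization level relates to information loss, the sketch below generalizes ages into bands and scores the loss as the spread of each band relative to the full value range. This score is only one simple possibility chosen for illustration and is not the measure used in this report. The function names, the sample ages and the value range of 0 to 99 are assumptions.

```python
def generalize(age, width):
    """Replace an exact age with the band of the given width that contains it."""
    low = (age // width) * width
    return (low, low + width - 1)


def information_loss(bands, domain_min, domain_max):
    """Average band spread divided by the domain spread (assumed measure).

    0.0 means every value is unchanged; 1.0 means every value was widened
    to the whole domain, so nothing about it remains.
    """
    domain_span = domain_max - domain_min
    spans = [high - low for low, high in bands]
    return sum(spans) / (len(spans) * domain_span)


ages = [23, 37, 41, 58]  # made-up sample values

# A wider band is a stronger anonymization level and gives a higher loss.
for width in (1, 10, 100):
    bands = [generalize(a, width) for a in ages]
    print(width, information_loss(bands, 0, 99))
```

A lower score means the anonymized data stay closer to the original plain data. A data application provider could compare such a score across candidate levels before choosing one.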