By this, we mean that either the entire transaction takes place at once or it does not happen at all; there is no midway point, and transactions never occur partially. Each transaction is treated as one unit that either runs to completion or is not executed at all. It involves the following two operations.
—Abort: If a transaction aborts, the changes it made to the database are not visible.
—Commit: If a transaction commits, the changes it made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consider the following transaction T, consisting of T1 and T2: transfer of 100 from account X to account Y.
If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before write(Y)), then the amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in its entirety in order to ensure the correctness of the database state.
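The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module, which wraps both updates in a single transaction (the accounts table and starting balances of 500 and 200 are taken from the example; the comment marks where a failure would otherwise leave the database inconsistent):

```python
import sqlite3

# In-memory database with two accounts, X = 500 and Y = 200 (values from the example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()

def transfer(conn, amount):
    """Move `amount` from X to Y atomically: both writes commit, or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = 'X'",
                         (amount,))
            # If a failure occurred here (after write(X) but before write(Y)),
            # the rollback would undo the deduction as well.
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = 'Y'",
                         (amount,))
    except sqlite3.Error:
        pass  # transaction was rolled back; the database is unchanged

transfer(conn, 100)
print(dict(conn.execute("SELECT name, balance FROM accounts")))  # {'X': 400, 'Y': 300}
```

The `with conn:` block is what makes the pair of updates one unit: either both `UPDATE` statements commit together, or an error rolls both back.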
This means that integrity constraints must be maintained so that the database is consistent both before and after the transaction. Consistency refers to the correctness of a database. Referring to the example above,
The total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs if T1 completes but T2 fails, leaving T incomplete.
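The invariant can be checked directly; a minimal sketch in Python using the balances above:

```python
# Balances before and after transaction T (values from the example above).
before = {"X": 500, "Y": 200}
after  = {"X": 400, "Y": 300}

# Consistency: the transfer must preserve the total amount across both accounts.
assert sum(before.values()) == sum(after.values()) == 700

# If T1 (the deduction) completed but T2 (the addition) failed, the invariant breaks:
partial = {"X": 400, "Y": 200}
assert sum(partial.values()) != sum(before.values())  # 600 != 700: inconsistent state
```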
This property ensures that multiple transactions can occur concurrently without leading to inconsistency of the database state. Transactions occur independently, without interference. Changes made by a particular transaction are not visible to any other transaction until that change has been written to memory or committed. This property ensures that concurrent execution of transactions results in a state equivalent to one that would be achieved had they been executed serially in some order.
Let X = 500, Y = 500.
Consider two transactions T and T''.
Suppose T has executed up to Read(Y) and then T'' starts. As a result, interleaving of operations takes place, due to which T'' reads the correct value of X but an incorrect (stale) value of Y, and the sum computed by
T'': (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of the transaction:
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency: 50 units appear to be lost. Hence, transactions must take place in isolation, and their changes should become visible to others only after they have been committed to the main memory.
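The text does not show the statements of T and T'', but the figures imply that T multiplies X by 100 and deducts 50 from Y, while T'' computes the sum X + Y. Under that assumption, the interleaving can be replayed in a few lines of Python:

```python
# Transactions reconstructed from the numbers above (assumed forms):
#   T:   read X; X = X * 100; write X; read Y; Y = Y - 50; write Y
#   T'': read X; read Y; Z = X + Y
X, Y = 500, 500

# T runs up to write(X) ...
X = X * 100            # X is now 50,000
# ... then T'' interleaves before T has written its update to Y:
Z = X + Y              # T'' sees the new X (50,000) but the old Y (500)
# T resumes and finishes:
Y = Y - 50             # Y is now 450

print(Z)      # 50500  <- sum seen by the interleaved T''
print(X + Y)  # 50450  <- correct sum after T completes; 50 units "lost"
```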
This property ensures that once the transaction has completed execution, the updates and modifications to the database are stored in and written to disk and they persist even if a system failure occurs. These updates now become permanent and are stored in non-volatile memory. The effects of the transaction, thus, are never lost.
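A small sketch of durability using Python's sqlite3 with a file-backed database (the file name and balance are illustrative): a committed update survives closing and reopening the connection, which stands in for a restart after a failure.

```python
import os
import sqlite3
import tempfile

# A file-backed database, so committed data lives on disk, not in process memory.
path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('X', 400)")
conn.commit()          # durability point: the update is written to disk
conn.close()           # simulate the application stopping

# A fresh connection (e.g. after a restart) still sees the committed data.
conn2 = sqlite3.connect(path)
print(conn2.execute("SELECT balance FROM accounts WHERE name='X'").fetchone()[0])  # 400
```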
The ACID properties, in totality, provide a mechanism to ensure the correctness and consistency of a database: each transaction is a group of operations that acts as a single unit, produces consistent results, executes in isolation from other operations, and has its updates durably stored.
2. Compare and contrast the two-tier client-server architecture for traditional DBMSs with the three-tier client-server architecture. Why is the latter architecture more appropriate for the Web?
3. What is a DDBMS? List its functionality, advantages and disadvantages.
DDBMS stands for distributed database management system. In a DDBMS, data is not stored at a single location; it may be stored on multiple computers at the same site or geographically spread far apart. Despite this, the distributed database appears as a single database to the user. A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system is software that manages the distributed database and provides the access mechanisms that make the distribution transparent to the user.
Functionality of DDBMS
1. Keeping track of data.
2. Replicated data management.
3. Distributed transaction management.
4. Distributed database recovery.
5. Query optimization to find the best access strategy.
6. Concurrency control.
7. Database administration.
Advantages of DDBMS
1. Data security: if there were a natural disaster such as a fire or an earthquake, not all the data would be lost, because it is stored at multiple locations.
2. The database can also be easily scaled up or down.
3. If one of the data nodes goes offline, the rest of the database can continue to function normally.
4. Faster data access.
5. Faster data processing.
Disadvantages of DDBMS
1. It is difficult to provide security, as the database needs to be secured at every location.
2. It is difficult to maintain data integrity.
3. There can also be redundancy, as the data is stored in multiple databases.
4. As storage needs increase, the database infrastructure needs to be upgraded at multiple sites.
5. Complexity of management and control.
4. What layers of transparency should be provided with a DDBMS?
The definition of a DDBMS states that the system should make the distribution transparent to the user. Transparency hides implementation details from the user. For example, in a centralized DBMS, data independence is a form of transparency: it hides changes in the definition and organization of the data from the user. A DDBMS may provide various levels of transparency. However, they all share the same overall objective: to make the use of the distributed database equivalent to that of a centralized database.
We can identify four main types of transparency in a DDBMS:
• Distribution transparency;
• Transaction transparency;
• Performance transparency;
• DBMS transparency.
Distribution transparency allows the user to perceive the database as a single, logical entity. If a DDBMS exhibits distribution transparency, then the user does not need to know that the data is fragmented (fragmentation transparency) or the location of data items (location transparency).
Distribution transparency can be classified into:
• Fragmentation transparency
• Location transparency
• Replication transparency
• Local Mapping transparency
• Naming transparency
Fragmentation transparency is the highest level of distribution transparency. If fragmentation transparency is provided by the DDBMS, then the user does not need to know that the data is fragmented. As a result, database accesses are based on the global schema, and the user does not need to specify fragment names or data locations.
Location transparency is the middle level of distribution transparency. With location transparency, the user must know how the data has been fragmented but still does not have to know where it is located.
Closely related to location transparency is replication transparency, which means that the user is unaware of the replication of fragments. Replication transparency is implied by location transparency.
Local mapping transparency:
This is the lowest level of distribution transparency. With local mapping transparency, the user needs to specify both fragment names and the locations of data items, taking into consideration any replication that may exist.
A query at this level is clearly more complex and time-consuming for the user to enter than one written with full fragmentation transparency. It is unlikely that a system providing only this level of transparency would be acceptable to end-users.
As a corollary to the above distribution transparencies, we have naming transparency.
As in a centralized database, each item in a distributed database must have a unique name. Therefore, the DDBMS must ensure that no two sites create a database object with the same name. One solution to this problem is to create a central name server, which has the responsibility for ensuring the uniqueness of all names in the system. However, this approach results in:
• Loss of some local autonomy;
• Performance problems, if the central site becomes a bottleneck;
• Low availability: if the central site fails, the remaining sites cannot create any new database objects.
An alternative solution is to prefix an object with the identifier of the site that created it. For example, the relation Branch created at site S1 might be named S1.Branch. Similarly, we need to be able to identify each fragment and each of its copies. Thus, copy 2 of fragment 3 of the Branch relation created at site S1 might be referred to as S1.Branch.F3.C2. However, this results in loss of distribution transparency.
An approach that resolves the problems with both of these solutions uses aliases (sometimes called synonyms) for each database object. Thus, S1.Branch.F3.C2 might be known as LocalBranch by the user at site S1. The DDBMS has the task of mapping an alias to the appropriate database object.
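A hypothetical sketch of such an alias catalog in Python (the mapping and helper function are illustrative, not an actual DDBMS interface):

```python
# Hypothetical catalog kept by the DDBMS at a site: it maps a user-visible
# alias to the fully qualified object name <site>.<relation>.<fragment>.<copy>.
alias_catalog = {
    "LocalBranch": "S1.Branch.F3.C2",   # alias used by the user at site S1
}

def resolve(name: str) -> str:
    """Return the fully qualified name for an alias, or the name unchanged
    if the user already supplied a fully qualified (dotted) name."""
    return alias_catalog.get(name, name)

print(resolve("LocalBranch"))      # S1.Branch.F3.C2
print(resolve("S2.Staff.F1.C1"))   # unchanged: already fully qualified
```

The point of the indirection is that users at a site work with short local names while the DDBMS retains distribution transparency by doing the mapping itself.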
Transaction transparency in a DDBMS environment ensures that all distributed transactions maintain the distributed database's integrity and consistency. A distributed transaction accesses data stored at more than one location. Each transaction is divided into a number of subtransactions, one for each site that has to be accessed; a subtransaction is represented by an agent.
The DDBMS must also ensure the atomicity of each subtransaction. Transaction transparency in a distributed DBMS is complicated by the fragmentation, allocation and replication schemas.
Performance transparency requires a DDBMS to perform as if it were a centralized DBMS. In a distributed environment, the system should not suffer any performance degradation due to the distributed architecture, for example due to the presence of the network. Performance transparency also requires the DDBMS to determine the most cost-effective strategy to execute a request.
In a centralized DBMS, the query processor (QP) must evaluate every data request and find an optimal execution strategy, consisting of an ordered sequence of operations on the database. In a distributed environment, the distributed query processor (DQP) maps a data request into an ordered sequence of operations on the local databases. It has the added complexity of taking into account the fragmentation, replication and allocation schemas. The DQP has to decide:
• Which fragment to access;
• Which copy of a fragment to use, if the fragment is replicated;
• Which location to use.
The DQP produces an execution strategy that is optimized with respect to some cost function. Typically, the costs associated with a distributed request include:
• The access time (I/O) cost involved in accessing the physical data on disk;
• The CPU time cost incurred when performing operations on data in main memory;
• The communication cost associated with the transmission of data across the network.
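These three components can be combined into a simple additive cost model; the function below and all its weights are invented purely for illustration:

```python
# Illustrative cost model for a distributed query plan: total cost is the sum
# of the I/O, CPU and communication components. All weights are made up.
def plan_cost(io_accesses, cpu_ops, bytes_sent, *,
              io_cost=0.01, cpu_cost=0.000001, net_cost_per_kb=0.1):
    return (io_accesses * io_cost
            + cpu_ops * cpu_cost
            + (bytes_sent / 1024) * net_cost_per_kb)

# On a slow WAN the communication term dominates, so a plan that ships fewer
# bytes wins even if it performs more local I/O and CPU work:
ship_all  = plan_cost(io_accesses=100,  cpu_ops=10_000, bytes_sent=5_000_000)
ship_less = plan_cost(io_accesses=1000, cpu_ops=50_000, bytes_sent=50_000)
print(ship_all > ship_less)  # True
```

Setting `net_cost_per_kb` to zero would model the centralized case, where only the first two factors matter.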
The first two factors are the only ones considered in a centralized system. In a distributed environment, the DDBMS must also take account of the communication cost, which may be the dominant factor in WANs with a bandwidth of a few kilobytes per second. In such cases, optimization may ignore I/O and CPU costs. However, LANs have a bandwidth comparable to that of disks, so in such cases optimization should not ignore I/O and CPU costs entirely.
DBMS transparency hides the knowledge that the local DBMSs may be different, and is therefore only applicable to heterogeneous DDBMSs. It is one of the most difficult transparencies to provide as a generalization.
5. What is NoSQL? What are the advantages, limitations and disadvantages of NoSQL?
NoSQL databases are a type of database created in the late 1990s to address the scaling and flexibility problems of relational databases, so called because they did not use SQL (today the term is often read as "Not Only SQL", since some of these systems implement query languages). NoSQL databases mostly address some of the following points: being non-relational, distributed, open-source and horizontally scalable.
It is important to mention that nowadays Relational Databases have improved dramatically, having resolved most of the problems they had when dealing with today's technology. NoSQL Databases are another way of storing data, not necessarily better than Relational Databases. Both are designed to resolve different kinds of needs.
Despite some obstacles, NoSQL databases have been widely adopted in many enterprises, for the following reasons:
1. Elastic scalability
RDBMSs are not as easy to scale out on commodity clusters, whereas NoSQL databases are designed for transparent expansion, taking advantage of new nodes. These databases are built for use with low-cost commodity hardware. In a world where upward (vertical) scalability is being replaced by outward (horizontal) scalability, NoSQL databases are a better fit.
2. Big data applications
Given that transaction rates are growing beyond recognition, there is a need to store massive volumes of data. While RDBMSs have grown to match these needs, it is difficult to realistically use a single RDBMS to manage such data volumes. These volumes are, however, easily handled by NoSQL databases.
3. Database administration
The best RDBMSs require the services of expensive administrators to design, install and maintain the systems. On the other hand, NoSQL databases require much less hands-on management, with data distribution and auto repair capabilities, simplified data models and fewer tuning and administration requirements. However, in practice, someone will always be needed to take care of performance and availability of databases.
4. Economy
RDBMSs require the installation of expensive storage systems and proprietary servers, while NoSQL databases can be easily installed on clusters of cheap commodity hardware as transaction and data volumes increase. This means that you can process and store more data at much lower cost.
Disadvantages of NoSQL
1. Less mature
RDBMSs have been around much longer than NoSQL databases. The first RDBMSs were released about 25 years ago. While proponents of NoSQL may present this age as an indicator of obsolescence, the intervening years have in fact matured RDBMSs into richly functional and stable systems.
In contrast, most of the NoSQL database alternatives have just barely made it out of the pre-production stages, and there are many important features that have not yet been implemented. It’s an exciting prospect for a developer to be teetering on the cutting edge of technology, but caution must be exercised to avoid any disastrous consequences.
2. Less support
All enterprises need the reassurance that, should a key function within their data management system fail, they will have access to competent support in a timely manner. All the RDBMS vendors have made great efforts to ensure that such services are available, and enterprises can also enlist 24-hour support from remote database administration services, which have the expertise to handle most of the RDBMSs.
Each NoSQL database, in contrast, tends to be open-source, with just one or two firms handling the support angle. Many have been developed by smaller startups that lack the resources to fund support on a global scale, as well as the credibility that established RDBMS vendors like Oracle, IBM and Microsoft enjoy.
3. Business intelligence and analytics
NoSQL databases were created with the demands of modern Web 2.0 applications in mind. As such, most features are directed at meeting those demands. Where the demands of a data application extend beyond the characteristic "insert-read-update-delete" cycle of a typical web app, these databases offer few features for analysis and ad-hoc querying.
Simple queries require some programming knowledge, and the most common business intelligence tools that many enterprises rely on do not offer connectivity to NoSQL databases. However, this may be solved in time, seeing as tools like Pig and Hive have been created to offer ad-hoc query functionality for NoSQL data stores.
4. Administration
The end goal for NoSQL database design was to offer a solution requiring no administration, but the reality on the ground is quite different. NoSQL databases still demand a lot of technical skill for both installation and maintenance.
5. No advanced expertise
Because NoSQL databases are still new, virtually every NoSQL developer out there is still learning the ropes, unlike RDBMSs, which have millions of proficient developers across the market and in every field of trade. Over time this situation will resolve itself, but at present it remains easier to find an RDBMS expert than a NoSQL expert.
Any organization that wants to implement NoSQL solutions needs to proceed with caution, bearing in mind the above limitations in addition to understanding the benefits that NoSQL databases offer over their relational counterparts.