Data vs. Information
- simple, unorganized and uninterpreted facts – stored representations of events and objects.
- normally represented by numbers, characters, pictures , symbols. (e.g. student names, sales figures).
- data that has been structured and can be processed in such a way that it becomes meaningful.
- typically summarised and in report form.
- Describes properties of data
- Describes properties or characteristics of the data, including data types, field sizes, allowable values, and documentation.
Traditional file-based systems
- A computer program has its own set of directly controlled data files.
- Open, read from and write to the data files directly, without any intervening “layers”.
- Data dependence.
- Integrated collection of logically related data.
- Integrated: single unit, not divided into separate or disconnected parts.
- Logically related: all about a common theme.
- Sharable and accessible: available and accessible by all authorised users and programs.
- Programs no longer accessed data files directly.
- Data independence – DBMS deals with such details as the record structure so that the program or users don’t need to know it.
Database Management System
- Create, maintain and provide controlled access to user databases.
- Systematic method of creating, updating, storing, and retrieving data in a database.
- Controlling data access, enforcing data integrity, managing concurrency control, and restoring a database.
Advantages of the Database Approach
- Programs are independent of data files
- Improved data consistency
- Improved sharing of data
- Reduced data redundancy
Components of a Database Environment
- CASE Tools – automated tools for design of DB and application programs
- Repository – Centralised store for data definitions, relationships, screen formats etc.
- Application programs – programs to create and maintain DB information.
- User interface – languages and menus etc. to permit users to interact with the DB.
- Data and database administrators
Database vs. Repository
- Database contains occurrences of data.
- Repository contains definitions of data.
Types of Databases
- Classification criteria
- Based on the levels in an organization.
- Based on the technology model.
- Based on the uses of databases.
- Based on the data location.
- Homogeneous and heterogeneous databases.
ANSI/SPARC Three Schema Architecture for DB Development
- The DBMS “hid” the physical location or structure of the data.
- Databases generally follow the ANSI/SPARC architecture.
- Also known as three-schema architecture.
- Describes important separation of levels.
- Consists of user views
- Each user view is a logical description of some portiion of data required by users to perform a task.
- Logical description – doesn’t involve physical storage information.
- External level
- Detailed logical specification of all data in the organisation or database.
- Each user view would be a subset of the conceptual schema.
- Independent of any storage technology.
- Conceptual level.
- Logical and Physical schema
- Logical schema = representation of data for a specific DBMS.
- Physical schema = how data is represented or stored in physical storage.
- Internal level
- Process of conceiving, abstracting, naming and representing the data that must be recorded.
- Entity-relationship modeling.
- Semantic object modeling.
- Models entities and relationships (and attributes).
Modeling Data Structure
- Do not model:
- Business processes
- Data flow
- Forms or reports
- Individual tasks
Data Structures vs. Flow vs. State
- ER diagrams model data structure – the static structure of data items and the relationships between them.
- DFD – data flow, it shows what data moves from one process to another.
- State change diagram – state changes, possible events and the change in state that the events cause.
- A person, place, object, event or concept about which the organisatioin wishes to maintain data.
Entity Instance vs. Entity Type
- Student – entity type
- A single student – instance of student
- Course is an entity type
- Course instance: CourseID, CourseName
Attributes vs Values
- Entity types have attributes
- Entity instances have values
- User interviews, examining existing processes, looking at forms
- Looking for nouns
- Finding objects or things that have many instances.
- Descriptive or defining properties
- Instances are uniquely identifiable
- uniquely identify every entity instance
- A meaningful association between one or more entity instances.
Minimum Cardinality (inner mark)
- Defines whether every instance of an entity type must participate in a relationship (one) or the participation is optional (zero)
Maximum Cardinality (outer mark)
- Defines if an instance participates in a relationships with more than one instances of the other entity type.
- Can be used to replace M:N relationships only.
- All relationships for participating entities are many.
- Resulting associative entity has independent meaning to end users.
- Has one or more attributes in addition to identifier.
- Participates in one or more relationships independent of entities related in associated relationship.
- A statement that defines or constrains some aspects of the business.
- Indicate entities, relationships, attributes to be recorded.
- Appropriate names for entities, etc.
- Constraints on entities, relationships and etc.
Strong vs. Weak Entity Types
- Strong entity
- one that exists independently of other entities.
- own identifier
- Weak entity
- one that has no meaning to the business without a strong entity on which it depends.
- do not have a complete identifier of their own
Required vs. Optional Attributes
- Optional attributes such as email address.
- Required attributes – each instance of an entity type must possess the attribute and value must be known, value can change.
- Optional attributes – some instances of an entity type may not possess the attribute or the value may not be known.
Single vs. Multi-level Attributes
- Single-valued attributes – An attribute that may take on at most one value for any given entity instance. (e.g. StudentID)
- Multi-valued attributes – An attribute that may take on more than one value for a given entity instance, curly brackets denotes a multi-valued attributes. (e.g. ContactPhone)
Simple vs. Composite Attributes
- Attributes can be made up of other attributes.
- Address – Street, City, State and Postcode.
- Composite attributes – represented with round brackets.
Stored vs. Derived Attributes
- Derived attributes – e.g. record date of birth and work out or derive the age, shown using square brackets.
- Number of entity types participating in a relationship.
- Cardinality is not relevant when working out the degree.
- Degree of 1 – involves only one entity type.
Ternary & n-ary Relationships
- Ternary relationships – degree three
- Higher degree relationships – n-ary relationships.
Supertypes & Subtypes
- Subtype – subgrouping of the instances in an entity type that is meaningful to the organisation.
- Supertype – generic entity that has a relationship with one or more subtypes.
- No unique identifier in either of the subtypes – subtypes inherit all the attributes from the supertype, including the identifier (attribute inheritance)
Generalisation & Specialisation
- Generalisation – process of defining a more general entity type from a set of more specialised entity types.
- Specialisation – process of defining one or more subtypes from a given supertypes.
Total vs. Partial Specialisation
- Total Specialisation – double line indicates that the complete set of subtypes is shown, no third subtype
- Partial Specialisation – same but not all of the subtypes are shown (single line)
Disjoint vs. Overlapping Subtypes
- Overlapping subtype – The O tells us that it is an overlapping subtype.
- Discriminator – attribute in the supertype (D)
- Cannot be both a postgraduate and undergraduate.
- The “d” tells us that it is a distinct subtype.
- entities – classes
- entity instances – objects
- Classes can additionally show operations
- Relationships are called associations
- Cardinality is called multiplicity