Archive for June 18th, 2009

Human–computer interaction

From Wikipedia, the free encyclopedia

  (Redirected from Human-computer interaction)
Jump to: navigation, search

This article is about the interaction between users and computers. For the direct communication between brain cells and computers, please see Brain-computer interface.

This article’s external links may not follow Wikipedia’s content policies or guidelines. Please improve this article by removing excessive or inappropriate external links.

Human–computer interaction (HCI) is the study of interaction between people (users) and computers. It is often regarded as the intersection of computer science, behavioral sciences, design and several other fields of study. Interaction between users and computers occurs at the user interface (or simply interface), which includes both software and hardware, for example, general-purpose computer peripherals and large-scale mechanical systems, such as aircraft and power plants. The following definition is given by the Association for Computing Machinery[1]:

“Human-computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them.”

Because human-computer interaction studies a human and a machine in conjunction, it draws from supporting knowledge on both the machine and the human side. On the machine side, techniques in computer graphics, operating systems, programming languages, and development environments are relevant. On the human side, communication theory, graphic and industrial design disciplines, linguistics, social sciences, cognitive psychology, and human performance are relevant. Engineering and design methods are also relevant. Due to the multidisciplinary nature of HCI, people with different backgrounds contribute to its success. HCI is also sometimes referred to as man–machine interaction (MMI) or computer–human interaction (CHI).



[edit] Goals

A basic goal of HCI is to improve the interactions between users and computers by making computers more usable and receptive to the user’s needs. Specifically, HCI is concerned with:

  • methodologies and processes for designing interfaces (i.e., given a task and a class of users, design the best possible interface within given constraints, optimizing for a desired property such as learning ability or efficiency of use)
  • methods for implementing interfaces (e.g. software toolkits and libraries; efficient algorithms)
  • techniques for evaluating and comparing interfaces
  • developing new interfaces and interaction techniques
  • developing descriptive and predictive models and theories of interaction

A long term goal of HCI is to design systems that minimize the barrier between the human’s cognitive model of what they want to accomplish and the computer’s understanding of the user’s task.

Professional practitioners in HCI are usually designers concerned with the practical application of design methodologies to real-world problems. Their work often revolves around designing graphical user interfaces and web interfaces.

Researchers in HCI are interested in developing new design methodologies, experimenting with new hardware devices, prototyping new software systems, exploring new paradigms for interaction, and developing models and theories of interaction.

[edit] Differences with related fields

HCI differs with human factors in that there is more of a focus on users working with computers rather than other kinds of machines or designed artifacts, and an additional focus on how to implement the (software and hardware) mechanisms behind computers to support human-computer interaction. It means that human factors is a broader term, and human factors of computers can be described as HCI – albeit some experts try to differ these areas.

About some experts’ opinion, HCI also differs with ergonomics in that there is less of a focus on repetitive work-oriented tasks and procedures, and much less emphasis on physical stress and the physical form or industrial design of physical aspects of the user interface, such as the physical form of keyboards and mice. However, it is caused by the lack of information about ergonomics. The oldest areas of ergonomics were the above mentioned, but nowadays (in the last decades), ergonomics have a much broader focus: ergonomics is equal to human factors. Cognitive ergonomics is a part of ergonomics, and the old-fashioned term software ergonomics signs a part of cognitive ergonomicssoftware ergonomics means HCI.

More discussion of the nuances between these fields is at [1]

Three areas of study have substantial overlap with HCI even as the focus of inquiry shifts. In the study of personal information management (PIM) human interactions with the computer are placed in a larger informational context – people may work with many forms of information, some computer-based, many not (e.g., white boards, notebooks, sticky notes, refrigerator magnets) in order to understand and effect desired changes in their world. In computer supported cooperative work (CSCW) emphasis is placed on the use of computing systems in support of the collaborative work of a group of people. The principles of human interaction management (HIM) extend the scope of CSCW to organizational level and can be implemented without use of computer systems.

[edit] Design principles

When evaluating a current user interface, or designing a new user interface, it is important to keep in mind the following experimental design principles:

  • Early focus on user(s) and task(s): Establish how many users are needed to perform the task(s) and determine who the appropriate users should be; someone that has never used the interface, and will not use the interface in the future, is most likely not a valid user. In addition, define the task(s) the users will be performing and how often the task(s) need to be performed.
  • Empirical measurement: Test the interface early on with real users who come in contact with the interface on an everyday basis, respectively. Keep in mind that results may be altered if the performance level of the user is not an accurate depiction of the real human-computer interaction. Establish quantitative usability specifics such as: the number of users performing the task(s), the time to complete the task(s), and the number of errors made during the task(s).
  • Iterative design: After determining the users, tasks, and empirical measurements to include, perform the following iterative design steps:
  1. Design the user interface
  2. Test
  3. Analyze results
  4. Repeat

Repeat the iterative design process until a sensible, user-friendly interface is created.[2]

[edit] Design methodologies

A number of diverse methodologies outlining techniques for human–computer interaction design have emerged since the rise of the field in the 1980s. Most design methodologies stem from a model for how users, designers, and technical systems interact. Early methodologies, for example, treated users’ cognitive processes as predictable and quantifiable and encouraged design practitioners to look to cognitive science results in areas such as memory and attention when designing user interfaces. Modern models tend to focus on a constant feedback and conversation between users, designers, and engineers and push for technical systems to be wrapped around the types of experiences users want to have, rather than wrapping user experience around a completed system.

  • User-centered design: user-centered design (UCD) is a modern, widely practiced design philosophy rooted in the idea that users must take center-stage in the design of any computer system. Users, designers and technical practitioners work together to articulate the wants, needs and limitations of the user and create a system that addresses these elements. Often, user-centered design projects are informed by ethnographic studies of the environments in which users will be interacting with the system. This practice is similar, but not identical to Participatory Design, which emphasizes the possibility for end-users to contribute actively through shared design sessions and workshops.
  • Principles of User Interface Design: these are seven principles that may be considered at any time during the design of a user interface in any order, namely Tolerance, Simplicity, Visibility, Affordance, Consistency, Structure and Feedback.[3]

[edit] Display design

Displays are human-made artifacts designed to support the perception of relevant system variables and to facilitate further processing of that information. Before a display is designed, the task that the display is intended to support must be defined (e.g. navigating, controlling, decision making, learning, entertaining, etc.). A user or operator must be able to process whatever information that a system generates and displays; therefore, the information must be displayed according to principles in a manner that will support perception, situation awareness, and understanding.


These principles of human perception and information processing can be utilized to create an effective display design. A reduction in errors, a reduction in required training time, an increase in efficiency, and an increase in user satisfaction are a few of the many potential benefits that can be achieved through utilization of these principles.

Certain principles may not be applicable to different displays or situations. Some principles may seem to be conflicting, and there is no simple solution to say that one principle is more important than another. The principles may be tailored to a specific design or situation. Striking a functional balance among the principles is critical for an effective design. [5]

Perceptual Principles

1. Make displays legible (or audible)

A display’s legibility is critical and necessary for designing a usable display. If the characters or objects being displayed cannot be discernible, then the operator cannot effectively make use of them.

2. Avoid absolute judgment limits

Do not ask the user to determine the level of a variable on the basis of a single sensory variable (e.g. color, size, loudness). These sensory variables can contain many possible levels.

3. Top-down processing

Signals are likely perceived and interpreted in accordance with what is expected based on a user’s past experience. If a signal is presented contrary to the user’s expectation, more physical evidence of that signal may need to be presented to assure that it is understood correctly.

4. Redundancy gain

If a signal is presented more than once, it is more likely that it will be understood correctly. This can be done by presenting the signal in alternative physical forms (e.g. color and shape, voice and print, etc.), as redundancy does not imply repetition. A traffic light is a good example of redundancy, as color and position are redundant.

5. Similarity causes confusion: Use discriminable elements

Signals that appear to be similar will likely be confused. The ratio of similar features to different features causes signals to be similar. For example, A423B9 is more similar to A423B8 than 92 is to 93. Unnecessary similar features should be removed and dissimilar features should be highlighted.

Mental Model Principles

6. Principle of pictorial realism

A display should look like the variable that it represents (e.g. high temperature on a thermometer shown as a higher vertical level). If there are multiple elements, they can be configured in a manner that looks like it would in the represented environment.

7. Principle of the moving part

Moving elements should move in a pattern and direction compatible with the user’s mental model of how it actually moves in the system. For example, the moving element on an altimeter should move upward with increasing altitude.

Principles Based on Attention

8. Minimizing information access cost

When the user’s attention is averted from one location to another to access necessary information, there is an associated cost in time or effort. A display design should minimize this cost by allowing for frequently accessed sources to be located at the nearest possible position. However, adequate legibility should not be sacrificed to reduce this cost.

9. Proximity compatibility principle

Divided attention between two information sources may be necessary for the completion of one task. These sources must be mentally integrated and are defined to have close mental proximity. Information access costs should be low, which can be achieved in many ways (e.g. close proximity, linkage by common colors, patterns, shapes, etc.). However, close display proximity can be harmful by causing too much clutter.

10. Principle of multiple resources

A user can more easily process information across different resources. For example, visual and auditory information can be presented simultaneously rather than presenting all visual or all auditory information.

Memory Principles

11. Replace memory with visual information: knowledge in the world

A user should not need to retain important information solely in working memory or to retrieve it from long-term memory. A menu, checklist, or another display can aid the user by easing the use of their memory. However, the use of memory may sometimes benefit the user rather than the need for reference to some type of knowledge in the world (e.g. a expert computer operator would rather use direct commands from their memory rather than referring to a manual). The use of knowledge in a user’s head and knowledge in the world must be balanced for an effective design.

12. Principle of predictive aiding

Proactive actions are usually more effective than reactive actions. A display should attempt to eliminate resource-demanding cognitive tasks and replace them with simpler perceptual tasks to reduce the use of the user’s mental resources. This will allow the user to not only focus on current conditions, but also think about possible future conditions. An example of a predictive aid is a road sign displaying the distance from a certain destination.

13. Principle of consistency

Old habits from other displays will easily transfer to support processing of new displays if they are designed in a consistent manner. A user’s long-term memory will trigger actions that are expected to be appropriate. A design must accept this fact and utilize consistency among different displays.

[edit] Future developments

The means by which humans interact with computers continues to evolve rapidly. Human–computer interaction is affected by the forces shaping the nature of future computing. These forces include:

  • Decreasing hardware costs leading to larger memories and faster systems
  • Miniaturization of hardware leading to portability
  • Reduction in power requirements leading to portability
  • New display technologies leading to the packaging of computational devices in new forms
  • Specialized hardware leading to new functions
  • Increased development of network communication and distributed computing
  • Increasingly widespread use of computers, especially by people who are outside of the computing profession
  • Increasing innovation in input techniques (i.e., voice, gesture, pen), combined with lowering cost, leading to rapid computerization by people previously left out of the “computer revolution.”
  • Wider social concerns leading to improved access to computers by currently disadvantaged groups

The future for HCI is expected to include the following characteristics:

Ubiquitous communication Computers will communicate through high speed local networks, nationally over wide-area networks, and portably via infrared, ultrasonic, cellular, and other technologies. Data and computational services will be portably accessible from many if not most locations to which a user travels.

High functionality systems Systems will have large numbers of functions associated with them. There will be so many systems that most users, technical or non-technical, will not have time to learn them in the traditional way (e.g., through thick manuals).

Mass availability of computer graphics Computer graphics capabilities such as image processing, graphics transformations, rendering, and interactive animation will become widespread as inexpensive chips become available for inclusion in general workstations.

Mixed media Systems will handle images, voice, sounds, video, text, formatted data. These will be exchangeable over communication links among users. The separate worlds of consumer electronics (e.g., stereo sets, VCRs, televisions) and computers will partially merge. Computer and print worlds will continue to cross assimilate each other.

High-bandwidth interaction The rate at which humans and machines interact will increase substantially due to the changes in speed, computer graphics, new media, and new input/output devices. This will lead to some qualitatively different interfaces, such as virtual reality or computational video.

Large and thin displays New display technologies will finally mature enabling very large displays and also displays that are thin, light weight, and have low power consumption. This will have large effects on portability and will enable the development of paper-like, pen-based computer interaction systems very different in feel from desktop workstations of the present.

Embedded computation Computation will pass beyond desktop computers into every object for which uses can be found. The environment will be alive with little computations from computerized cooking appliances to lighting and plumbing fixtures to window blinds to automobile braking systems to greeting cards. To some extent, this development is already taking place. The difference in the future is the addition of networked communications that will allow many of these embedded computations to coordinate with each other and with the user. Human interfaces to these embedded devices will in many cases be very different from those appropriate to workstations.

Augmented reality A common staple of science fiction, augmented reality refers to the notion of layering relevant information into our vision of the world. Existing projects show real-time statistics to users performing difficult tasks, such as manufacturing. Future work might include augmenting our social interactions by providing additional information about those we converse with.

Group interfaces Interfaces to allow groups of people to coordinate will be common (e.g., for meetings, for engineering projects, for authoring joint documents). These will have major impacts on the nature of organizations and on the division of labor. Models of the group design process will be embedded in systems and will cause increased rationalization of design.

User Tailorability Ordinary users will routinely tailor applications to their own use and will use this power to invent new applications based on their understanding of their own domains. Users, with their deeper knowledge of their own knowledge domains, will increasingly be important sources of new applications at the expense of generic systems programmers (with systems expertise but low domain expertise).

Information Utilities Public information utilities (such as home banking and shopping) and specialized industry services (e.g., weather for pilots) will continue to proliferate. The rate of proliferation will accelerate with the introduction of high-bandwidth interaction and the improvement in quality of interfaces.

[edit] Human–computer interface

The human–computer interface can be described as the point of communication between the human user and the computer. The flow of information between the human and computer is defined as the loop of interaction. The loop of interaction has several aspects to it including:

  • Task Environment: The conditions and goals set upon the user.
  • Machine Environment: The environment that the computer is connected to, i.e a laptop in a college student’s dorm room.
  • Areas of the Interface: Non-overlapping areas involve processes of the human and computer not pertaining to their interaction. Meanwhile, the overlapping areas only concern themselves with the processes pertaining to their interaction.
  • Input Flow: Begins in the task environment as the user has some task that requires using their computer.
  • Output: The flow of information that originates in the machine environment.
  • Feedback: Loops through the interface that evaluate, moderate, and confirm processes as they pass from the human through the interface to the computer and back.

[edit] Academic conferences

One of the top academic conferences for new research in human-computer interaction, especially within computer science, is the annually held ACM‘s Conference on Human Factors in Computing Systems, usually referred to by its short name CHI (pronounced kai, or khai). CHI is organized by ACM SIGCHI Special Interest Group on Computer–Human Interaction. CHI is a large, highly competitive conference, with thousands of attendants, and is quite broad in scope.

There are also dozens of other smaller, regional or specialized HCI-related conferences held around the world each year, the most important of which include (see also [2]):

[edit] Special purpose

  • UIST: ACM Symposium on User Interface Software and Technology.
  • CSCW: ACM conference on Computer Supported Cooperative Work.
  • ECSCW: European Conference on Computer-Supported Cooperative Work. Alternates yearly with CSCW.
  • ICMI: International Conference on Multimodal Interfaces.
  • MobileHCI: International Conference on Human-Computer Interaction with Mobile Devices and Services.
  • DIS: ACM conference on Designing Interactive Systems.
  • NIME: International Conference on New Interfaces for Musical Expression.
  • HRI: ACM/IEEE International Conference on Human-robot interaction.
  • IUI: International Conference on Intelligent User Interfaces.
  • Ubicomp: International Conference on Ubiquitous computing
  • ASSETS: ACM International Conference on Computers and Accessibility

[edit] Regional and general HCI

  • INTERACT: IFIP TC13 International Conference on Human-Computer Interaction. Biennial, alternating years with AVI.
  • AVI: International Working Conference on Advanced Visual Interfaces. Held biennially in Italy, alternating years with INTERACT.
  • HCI International: International Conference on Human-Computer Interaction.
  • ACHI: International Conferences on Advances in Human-Computer Interaction.
  • HCI: British HCI Conference.
  • OZCHI: Australasian HCI Conference.
  • IHM: Annual French-speaking HCI Conference.
  • Graphics Interface: Annual Canadian computer graphics and HCI conference. The oldest regularly scheduled conference for graphics and human-computer interaction.
  • NordiCHI: Nordic Conference on Human-Computer Interaction. Biennial.

[edit] See also

[edit] Footnotes

This article includes a list of references or external links, but its sources remain unclear because it has insufficient inline citations. Please help to improve this article by introducing more precise citations where appropriate. (April 2009)

  1. ^ ACM SIGCHI Curricula for Human-Computer Interaction
  2. ^ Green, Paul (2008). Iterative Design. Lecture presented in Industrial and Operations Engineering 436 (Human Factors in Computer Systems, University of Michigan, Ann Arbor, MI, February 4, 2008.
  3. ^ Pattern Language
  4. ^ Wickens, Christopher D., John D. Lee, Yili Liu, and Sallie E. Gordon Becker. An Introduction to Human Factors Engineering. Second ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2004. 185–193.
  5. ^ Brown, C. Marlin. Human-Computer Interface Design Guidelines. Intellect Books, 1998. 2–3.

[edit] Further reading

[edit] External links

Read Full Post »

Mathematical proof

Mathematical proof

From Wikipedia, the free encyclopedia

Jump to: navigation, search

In mathematics, a proof is a convincing demonstration (within the accepted standards of the field) that some mathematical statement is necessarily true[1]. Proofs are obtained from deductive reasoning, rather than from inductive or empirical arguments. That is, a proof must demonstrate that a statement is true in all cases, without a single exception. An unproved proposition that is believed to be true is known as a conjecture.

The statement that is proved is often called a theorem[1]. Once a theorem is proved, it can be used as the basis to prove further statements. A theorem may also be referred to as a lemma, especially if it is intended for use as a stepping stone in the proof of another theorem.

Proofs employ logic but usually include some amount of natural language which usually admits some ambiguity. In fact, the vast majority of proofs in written mathematics can be considered as applications of rigorous informal logic. Purely formal proofs, written in symbolic language instead of natural language, are considered in proof theory. The distinction between formal and informal proofs has led to much examination of current and historical mathematical practice, quasi-empiricism in mathematics, and so-called folk mathematics (in both senses of that term). The philosophy of mathematics is concerned with the role of language and logic in proofs, and mathematics as a language.



[edit] History and etymology

For more details on this topic, see History of logic.

Plausibility arguments using heuristic devices such as pictures and analogies preceded strict mathematical proof..[2] The early history of the concept of proof dates back to the early Greek and Chinese civilisations. Thales (624–546 BCE) proved some theorems in geometry. Eudoxus (408–355 BCE) and Theaetetus (417–369 BCE) formulated theorems but did not prove them. Aristotle (384–322 BCE) said definitions should describe the concept being defined in terms of other concepts already known. Euclid (300 BCE) began with undefined terms and axioms (propositions regarding the undefined terms assumed to be self-evidently true, from the Greek “axios” meaning “something worthy”) and used these to prove theorems using deductive logic. Modern proof theory treats proofs as inductively defined data structures. There is no longer an assumption that axioms are “true” in any sense; this allows for parallel mathematical theories built on alternate sets of axioms (see Axiomatic set theory and Non-Euclidean geometry for examples).

The word Proof comes from the Latin probare meaning “to test”. Related modern words are the English “probe”, “proboscis”, “probation”, and “probability”, and the Spanish “probar” (to smell or taste, or (lesser use) touch or test). The early use of “probity” was in the presentation of legal evidence. A person of authority, such as a nobleman, was said to have probity, whereby the evidence was by his relative authority, which outweighed empirical testimony.[3]

[edit] Nature and purpose

There are two different conceptions of mathematical proof.[4] The first is an informal proof, a rigorous natural-language expression that is intended to convince the audience of the truth of a theorem. Because of their use of natural language, the standards of rigor for informal proofs will depend on the audience of the proof. In order to be considered a proof, however, the argument must be rigorous enough; a vague or incomplete argument is not a proof. Informal proofs are the type of proof typically encountered in published mathematics. They are sometimes called “formal proofs” because of their rigor, but logicians use the term “formal proof” to refer to a different type of proof entirely.

In logic, a formal proof is not written in a natural language, but instead uses a formal language consisting of certain strings of symbols from a fixed alphabet. This allows the definition of a formal proof to be precisely specified without any ambiguity. The field of proof theory studies formal proofs and their properties. Although each informal proof can, in theory, be converted into a formal proof, this is rarely done in practice. The study of formal proofs is used to determine properties of provability in general, and to show that certain undecidable statements are not provable.

A classic question in philosophy asks whether mathematical proofs are analytic or synthetic. Kant, who introduced the analytic-synthetic distinction, believed mathematical proofs are synthetic.

Proofs may be viewed as aesthetic objects, admired for their mathematical beauty. The mathematician Paul Erdős was known for describing proofs he found particularly elegant as coming from “The Book”, a hypothetical tome containing the most beautiful method(s) of proving each theorem. The book Proofs from THE BOOK, published in 2003, is devoted to presenting 32 proofs its editors find particularly pleasing.

[edit] Methods of proof

[edit] Direct proof

Main article: Direct proof

In direct proof[5], the conclusion is established by logically combining the axioms, definitions, and earlier theorems. For example, direct proof can be used to establish that the sum of two even integers is always even:

For any two even integers x and y we can write x = 2a and y = 2b for some integers a and b, since both x and y are multiples of 2. But the sum x + y = 2a + 2b = 2(a + b) is also a multiple of 2, so it is therefore even by definition.

This proof uses definition of even integers, as well as distribution law.

[edit] Proof by mathematical induction

In proof by mathematical induction[6], first a “base case” is proved, and then an “induction rule” is used to prove a (often infinite) series of other cases. Since the base case is true, the infinity of other cases must also be true, even if all of them cannot be proved directly because of their infinite number. A subset of induction is infinite descent. Infinite descent can be used to prove the irrationality of the square root of two.

The principle of mathematical induction states that: Let N = { 1, 2, 3, 4, … } be the set of natural numbers and P(n) be a mathematical statement involving the natural number n belonging to N such that

  • (i) P(1) is true, i.e., P(n) is true for n = 1
  • (ii) P(n + 1) is true whenever P(n) is true, i.e., P(n) is true implies that P(n + 1) is true.

Then P(n) is true for all natural numbers n.

Mathematicians often use the term “proof by induction” as shorthand for a proof by mathematical induction.[7] However, the term “proof by induction” may also be used in logic to mean an argument that uses inductive reasoning.

[edit] Proof by transposition

Main article: Transposition (logic)

Proof by transposition establishes the conclusion “if p then q” by proving the equivalent contrapositive statement “if not q then not p“.

[edit] Proof by contradiction

In proof by contradiction (also known as reductio ad absurdum, Latin for “by reduction toward the absurd”), it is shown that if some statement were so, a logical contradiction occurs, hence the statement must be not so. This method is perhaps the most prevalent of mathematical proofs. A famous example of a proof by contradiction shows that \sqrt{2} is an Irrational number:

Suppose that \sqrt{2} is a rational number, so \sqrt{2} = {a\over b} where a and b are non-zero integers with no common factor (definition of a rational number). Thus, b\sqrt{2} = a. Squaring both sides yields 2b2 = a2. Since 2 divides the left hand side, 2 must also divide the right hand side (as they are equal and both integers). So a2 is even, which implies that a must also be even. So we can write a = 2c, where c is also an integer. Substitution into the original equation yields 2b2 = (2c)2 = 4c2. Dividing both sides by 2 yields b2 = 2c2. But then, by the same argument as before, 2 divides b2, so b must be even. However, if a and b are both even, they share a factor, namely 2. This contradicts our assumption, so we are forced to conclude that \sqrt{2} is an irrational number.

[edit] Proof by construction

Main article: Proof by construction

Proof by construction, or proof by example, is the construction of a concrete example with a property to show that something having that property exists. Joseph Liouville, for instance, proved the existence of transcendental numbers by constructing an explicit example.

[edit] Proof by exhaustion

Main article: Proof by exhaustion

In proof by exhaustion, the conclusion is established by dividing it into a finite number of cases and proving each one separately. The number of cases sometimes can become very large. For example, the first proof of the four colour theorem was a proof by exhaustion with 1,936 cases. This proof was controversial because the majority of the cases were checked by a computer program, not by hand. The shortest known proof of the four colour theorem today still has over 600 cases.

[edit] Probabilistic proof

Main article: Probabilistic method

A probabilistic proof is one in which an example is shown to exist, with certainty, by using methods of probability theory. This is not to be confused with an argument that a theorem is ‘probably’ true. The latter type of reasoning can be called a ‘plausibility argument’ and is not a proof; in the case of the Collatz conjecture it is clear how far that is from a genuine proof.[8] Probabilistic proof, like proof by construction, is one of many ways to show existence theorems.

[edit] Combinatorial proof

Main article: Combinatorial proof

A combinatorial proof establishes the equivalence of different expressions by showing that they count the same object in different ways. Often a bijection between two sets is used to show that the expressions for their two sizes are equal. Alternatively, a double counting argument provides two different expressions for the size of a single set, again showing that the two expressions are equal.

[edit] Nonconstructive proof

Main article: Nonconstructive proof

A nonconstructive proof establishes that a certain mathematical object must exist (e.g. “Some X satisfies f(X)”), without explaining how such an object can be found. Often, this takes the form of a proof by contradiction in which the nonexistence of the object is proven to be impossible. In contrast, a constructive proof establishes that a particular object exists by providing a method of finding it. A famous example of a nonconstructive proof shows that there exist two irrational numbers a and b such that ab is a rational number:

Either \sqrt{2}^{\sqrt{2}} is a rational number and we are done (take a=b=\sqrt{2}), or \sqrt{2}^{\sqrt{2}} is irrational so we can write a=\sqrt{2}^{\sqrt{2}} and b=\sqrt{2}. This then gives \left (\sqrt{2}^{\sqrt{2}}\right )^{\sqrt{2}}=\sqrt{2}^{2}=2, which is thus a rational of the form ab.

[edit] Visual proof

Visual proof for the (3, 4, 5) triangle as in the Chou Pei Suan Ching 500–200 BC.

Although not a formal proof, a visual demonstration of a mathematical theorem is sometimes called a “proof without words“. The picture at right is an example of a historic visual proof of the Pythagorean theorem in the case of the (3,4,5) triangle.

[edit] Elementary proof

Main article: Elementary proof

An elementary proof is a proof which only uses basic techniques. More specifically, the term is used in number theory to refer to proofs that make no use of complex analysis. For some time it was thought that certain theorems, like the prime number theorem, could only be proved using “higher” mathematics. However, over time, many of these results have been reproved using only elementary techniques.

[edit] Two-column proof

A two-column proof published in 1913

A particular form of proof using two parallel columns is often used in elementary geometry classes.[9] The proof is written as a series of lines in two columns. In each line, the left hand column contains propositions (or sometimes called statements), while the right hand column contains a brief explanation of how this proposition is either an axiom, a hypothesis, or can be obtained from previous lines (or sometimes just called “reasons”).

[edit] Statistical proofs in pure mathematics

Main article: Statistical proof

The expression “statistical proof” may be used technically or colloquially in areas of pure mathematics, such as involving cryptography, chaotic series, and probabilistic or analytic number theory.[10][11][12] It is less commonly used to refer to a mathematical proof in the branch of mathematics known as mathematical statistics. See also “Statistical proof using data” section below.[4].

[edit] Computer-assisted proofs

Until the twentieth century it was assumed that any proof could, in principle, be checked by a competent mathematician to confirm its validity.[2] However, computers are now used both to prove theorems and to carry out calculations that are too long for any human or team of humans to check; the first proof of the four color theorem is an example of a computer-assisted proof. Some mathematicians are concerned that the possibility of an error in a computer program or a run-time error in its calculations calls the validity of such computer-assisted proofs into question. In practice, the chances of an error invalidating a computer-assisted proof can be reduced by incorporating redundancy and self-checks into calculations, and by developing multiple independent approaches and programs.

[edit] Undecidable statements

A statement that is neither provable nor disprovable from a set of axioms is called undecidable (from those axioms). One example is the parallel postulate, which is neither provable nor refutable from the remaining axioms of Euclidean geometry.

Mathematicians have shown there are many statements that are neither provable nor disprovable in Zermelo-Fraenkel set theory with the axiom of choice (ZFC), the standard system of set theory in mathematics (assuming that ZFC is consistent); see list of statements undecidable in ZFC.

Gödel’s (first) incompleteness theorem shows that many axiom systems of mathematical interest will have undecidable statements.

[edit] Heuristic mathematics and experimental mathematics

While early mathematicians such as Eudoxus of Cnidus did not use proofs, from Euclid to the foundational mathematics developments of the late 19th and 20th centuries, proofs were an essential part of mathematics.[13] With the increase in computing power in the 1960’s, significant work began to be done investigating mathematical objects outside of the proof-theorem framework,[14] in experimental mathematics. Early pioneers of these methods intended the work ultimately to be embedded in a classical proof-theorem framework, e.g. the early development of fractal geometry[15], which was ultimately so embedded.

[edit] Related concepts

[edit] Colloquial use of “mathematical proof”

The expression “mathematical proof” is used by lay people to refer to using mathematical methods or arguing with mathematical objects, such as numbers, to demonstrate something about everyday life, or when data used in an argument are numbers. It is sometime also used to mean a “statistical proof” (below), especially when used to argue from data.

[edit] Statistical proof using data

Main article: Statistical proof

“Statistical proof” from data refers to the application of statistics, data analysis, or Bayesian analysis to infer propositions regarding the probability of data. While using mathematical proof to establish theorems in statistics, it is usually not a mathematical proof in that the assumpions from which probability statements are derived require empirical evidence from outside mathematics to verify. In physics, in addition to statistical methods, “statistical proof” can refer to the specialized mathematical methods of physics applied to analyze data in a particle physics experiment or observational study in cosmology. “Statistical proof” may also refer to raw data or a convincing diagram involving data, such as scatter plots, when the data or diagram is adequately convincing without further anaylisis.

[edit] Inductive logic proofs and Bayesian analysis

Main articles: Inductive logic and Bayesian analysis

Proofs using inductive logic, while considered mathematical in nature, seek to establish propositions with a degree of certainty, which acts in a similar manner to probability, and may be less than one certainty. Bayesian analysis establishes assertions as to the degree of a person’s subjective belief. Inductive logic should not be confused with mathematical induction.

[edit] Proofs as mental objects

Psychologism views mathematical proofs as psychological or mental objects. Mathematician philosophers, such as Leibnitz, Frege, and Carnap, have attempted to develop a semantics for what they considered to be the language of thought, whereby standards of mathematical proof might be applied to empirical science.

[edit] Influence of mathematical proof methods outside mathematics

Philosopher-mathematicians such as Schopenhauer have attempted to formulate philosophical arguments in an axiomatic manner, whereby mathematical proof standards could be applied to argumentation in general philosophy. Other mathematician-philosophers have tried to use standards of mathematical proof and reason, without empiricism, to arrive at statements outside of mathematics, but having the certainty of propositions deduced in a mathematical proof, such as Descarte’s cogito argument. Kant and Frege considered mathematical proof to be analytic apriori.

[edit] Ending a proof

Main article: Q.E.D.

Sometimes, the abbreviation “Q.E.D.” is written to indicate the end of a proof. This abbreviation stands for “Quod Erat Demonstrandum”, which is Latin for “that which was to be demonstrated”. A more common alternative is to use a square or a rectangle, such as or , known as a “tombstone” or “halmos”. Often, “which was to be shown” is verbally stated when writing “QED”, ““, or “” in an oral presentation on a board.

[edit] See also

[edit] References

  1. ^ a b Cupillari, Antonella. The Nuts and Bolts of Proofs. Academic Press, 2001. Page 3.
  2. ^ a b The History and Concept of Mathematical Proof, Steven G. Krantz. 1. February 5, 2007
  3. ^ The Emergence of Probability, Ian Hacking
  4. ^ Buss, 1997, p. 3
  5. ^ Cupillari, page 20.
  6. ^ Cupillari, page 46.
  7. ^ Proof by induction, University of Warwick Glossary of Mathematical Terminology
  8. ^ While most mathematicians do not think that probabilistic evidence ever counts as a genuine mathematical proof, a few mathematicians and philosophers have argued that at least some types of probabilistic evidence (such as Rabin’s probabilistic algorithm for testing primality) are as good as genuine mathematical proofs. See, for example, Davis, Philip J. (1972), “Fidelity in Mathematical Discourse: Is One and One Really Two?” American Mathematical Monthly 79:252-63. Fallis, Don (1997), “The Epistemic Status of Probabilistic Proof.” Journal of Philosophy 94:165-86.
  9. ^ Patricio G. Herbst, Establishing a Custom of Proving in American School Geometry: Evolution of the Two-Column Proof in the Early Twentieth Century, Educational Studies in Mathematics, Vol. 49, No. 3 (2002), pp. 283-312,
  10. ^ “in number theory and commutative algebra… in particular the statistical proof of the lemma.” [1]
  11. ^ “Whether constant π (i.e., pi) is normal is a confusing problem without any strict theoretical demonstration except for some statistical proof”” (Derogatory use.)[2]
  12. ^ “these observations suggest a statistical proof of Goldbach’s conjecture with very quickly vanishing probability of failure for large E” [3]
  13. ^What to do with the pictures? Two thoughts surfaced: the first was that they were unpublishable in the standard way, there were no theorems only very suggestive pictures. They furnished convincing evidence for many conjectures and lures to further exploration, but theorems were coins of the realm ant the conventions of that day dictated that journals only published theorems“, David Mumford, Caroline Series and David Wright, Indra’s Pearls, 2002
  14. ^Mandelbrot, working at the IBM Research Laboratory, did some computer simulations for these sets on the reasonable assumption that, if you wanted to prove something, it might be helpful to know the answer ahead of time.A Note on the History of Fractals,
  15. ^… brought home again to Benoit [Mandelbrot] that there was a ‘mathematics of the eye’, that visualization of a problem was as valid a method as any for finding a solution. Amazingly, he found himself alone with this conjecture. The teaching of mathematics in France was dominated by a handful of dogmatic mathematicians hiding behind the pseudonym ‘Bourabki’… “, Introducing Fractal Geometry, Nigel Lesmoir-Gordon

[edit] Sources

[edit] External links


v  d  e


Portal ·Category ·WikiProject ·Logic stubs ·Mathlogic stubs ·Cleanup ·Talk ·changes


History and core topics

Core topics


Key concepts and logics

of logic
Other non





Key figures

Alfarabi · Algazel · Alkindus · Al-Razi · Aristotle · Averroes · Avicenna · Boole · Cantor · Carnap · Church · Dharmakirti · Dignāga · Frege · Gentzen · Kanada · Gödel · Gotama · Hilbert · Ibn al-Nafis · Ibn Hazm · Ibn Taymiyyah · Kripke · Leibniz · Mozi · Nagarjuna · Pāṇini · Peano · Peirce · Putnam · Quine · Russell · Skolem · Suhrawardi · Tarski · Turing · Whitehead · Zadeh



Outline of logic · General · Basic · Mathematical logic · Boolean algebra · Set theory


Read Full Post »

Mathematical logic

Mathematical logic

From Wikipedia, the free encyclopedia

Jump to: navigation, search

Mathematical logic is a subfield of mathematics with close connections to computer science and philosophical logic.[1] The field includes the mathematical study of logic and the applications of formal logic to other areas of mathematics. The unifying themes in mathematical logic include the study of the expressive power of formal systems and the deductive power of formal proof systems.

Mathematical logic is often divided into the subfields of set theory, model theory, recursion theory, and proof theory and constructive mathematics. These areas share basic results on logic, particularly first-order logic, and definability.

Since its inception, mathematical logic has contributed to, and has been motivated by, the study of foundations of mathematics. This study began in the late 19th century with the development of axiomatic frameworks for geometry, arithmetic, and analysis. In the early 20th century it was shaped by David Hilbert‘s program to prove the consistency of foundational theories. Results of Kurt Gödel, Gerhard Gentzen, and others provided partial resolution to the program, and clarified the issues involved in proving consistency. Work in set theory showed that almost all ordinary mathematics can be formalized in terms of sets, although there are some theorems that cannot be proven in common axiom systems for set theory. Contemporary work in the foundations of mathematics often focuses on establishing which parts of mathematics can be formalized in particular formal systems, rather than trying to find theories in which all of mathematics can be developed.



[edit] History

Mathematical logic emerged in the mid-19th century as a subfield of mathematics independent of the traditional study of logic (Ferreirós 2001, p. 443). Before this emergence, logic was studied with rhetoric, through the syllogism, and with philosophy. The first half of the 20th century saw an explosion of fundamental results, accompanied by vigorous debate over the foundations of mathematics.

[edit] Early history

Further information: History of logic

Sophisticated theories of logic were developed in many cultures, including China, India, Greece and the Islamic world. In the 18th century, attempts to treat the operations of formal logic in a symbolic or algebraic way had been made by philosophical mathematicians including Leibniz and Lambert, but their labors remained isolated and little known.

[edit] 19th century

In the middle of the nineteenth century, George Boole and then Augustus De Morgan presented systematic mathematical treatments of logic. Their work, building on work by algebraists such as George Peacock, extended the traditional Aristotelian doctrine of logic into a sufficient framework for the study of foundations of mathematics (Katz 1998, p. 686).

Charles Sanders Peirce built upon the work of Boole to develop a logical system for relations and quantifiers, which he published in several papers from 1870 to 1885. Gottlob Frege presented an independent development of logic with quantifiers in his Begriffsschrift, published in 1879. Frege’s work remained obscure, however, until Bertrand Russell began to promote it near the turn of the century. The two-dimensional notation Frege developed was never widely adopted and is unused in contemporary texts.

From 1890 to 1905, Ernst Schröder published Vorlesungen über die Algebra der Logik in three volumes. This work summarized and extended the work of Boole, De Morgan, and Peirce, and was a comprehensive reference to symbolic logic as it was understood at the end of the 19th century.

[edit] Foundational theories

Some concerns that mathematics had not been built on a proper foundation led to the development of axiomatic systems for fundamental areas of mathematics such as arithmetic, analysis, and geometry.

In logic, the term arithmetic refers to the theory of the natural numbers. Giuseppe Peano (1888) published a set of axioms for arithmetic that came to bear his name, using a variation of the logical system of Boole and Schröder but adding quantifiers. Peano was unaware of Frege’s work at the time. Around the same time Richard Dedekind showed that the natural numbers are uniquely characterized by their induction properties. Dedekind (1888) proposed a different characterization, which lacked the formal logical character of Peano’s axioms. Dedekind’s work, however, proved theorems inaccessible in Peano’s system, including the uniqueness of the set of natural numbers (up to isomorphism) and the recursive definitions of addition and multiplication from the successor function and mathematical induction.

In the mid-19th century, flaws in Euclid’s axioms for geometry became known (Katz 1998, p. 774). In addition to the independence of the parallel postulate, established by Nikolai Lobachevsky in 1826 (Lobachevsky 1840), mathematicians discovered that certain theorems taken for granted by Euclid were not in fact provable from his axioms. Among these is the theorem that a line contains at least two points, or that circles of the same radius whose centers are separated by that radius must intersect. Hilbert (1899) developed a complete set of axioms for geometry, building on previous work by Pasch (1882). The success in axiomatizing geometry motivated Hilbert to seek complete axiomatizations of other areas of mathematics, such as the natural numbers and the real line. This would prove to be a major area of research in the first half of the 20th century.

The 19th century saw great advances in the theory of real analysis, including theories of convergence of functions and Fourier series. Mathematicians such as Karl Weierstrass began to construct functions that stretched intuition, such as nowhere-differentiable continuous functions. Previous conceptions of a function as a rule for computation, or a smooth graph, were no longer adequate. Weierstrass began to advocate the arithmetization of analysis, which sought to axiomatize analysis using properties of the natural numbers. The modern “ε-δ” definition of limits and continuous functions was developed by Bolzano and Cauchy between 1817 and 1823 (Felscher 2000). In 1858, Dedekind proposed a definition of the real numbers in terms of Dedekind cuts of rational numbers (Dedekind 1872), a definition still employed in contemporary texts.

Georg Cantor developed the fundamental concepts of infinite set theory. His early results developed the theory of cardinality and proved that the reals and the natural numbers have different cardinalities (Cantor 1874). Over the next twenty years, Cantor developed a theory of transfinite numbers in a series of publications. In 1891, he published a new proof of the uncountability of the real numbers that introduced the diagonal argument, and used this method to prove Cantor’s theorem that no set can have the same cardinality as its powerset. Cantor believed that every set could be well-ordered, but was unable to produce a proof for this result, leaving it as an open problem in 1895 (Katz 1998, p. 807).

[edit] 20th century

In the early decades of the 20th century, the main areas of study were set theory and formal logic. The discovery of paradoxes in informal set theory caused some to wonder whether mathematics itself is inconsistent, and to look for proofs of consistency.

In 1900, Hilbert posed a famous list of 23 problems for the next century. The first two of these were to resolve the continuum hypothesis and prove the consistency of elementary arithmetic, respectively; the tenth was to produce a method that could decide whether a multivariate polynomial equation over the integers has a solution. Subsequent work to resolve these problems shaped the direction of mathematical logic, as did the effort to resolve Hilbert’s Entscheidungsproblem, posed in 1928. This problem asked for a procedure that would decide, given a formalized mathematical statement, whether the statement is true or false.

[edit] Set theory and paradoxes

Ernst Zermelo (1904) gave a proof that every set could be well-ordered, a result Georg Cantor had been unable to obtain. To achieve the proof, Zermelo introduced the axiom of choice, which drew heated debate and research among mathematicians and the pioneers of set theory. The immediate criticism of the method led Zermelo to publish a second exposition of his result, directly addressing criticisms of his proof (Zermelo 1908). This paper led to the general acceptance of the axiom of choice in the mathematics community.

Skepticism about the axiom of choice was reinforced by recently discovered paradoxes in naive set theory. Cesare Burali-Forti (1897) was the first to state a paradox: the Burali-Forti paradox shows that the collection of all ordinal numbers cannot form a set. Very soon thereafter, Bertrand Russell discovered Russell’s paradox in 1901, and Jules Richard (1905) discovered Richard’s paradox.

Zermelo (1908) provided the first set of axioms for set theory. These axioms, together with the additional axiom of replacement proposed by Abraham Fraenkel, are now called Zermelo–Fraenkel set theory (ZF). Zermelo’s axioms incorporated the principle of limitation of size to avoid Russell’s paradox.

In 1910, the first volume of Principia Mathematica by Russell and Alfred North Whitehead was published. This seminal work developed the theory of functions and cardinality in a completely formal framework of type theory, which Russell and Whitehead developed in an effort to avoid the paradoxes. Principia Mathematica is considered one of the most influential works of the 20th century, although the framework of type theory did not prove popular as a foundational theory for mathematics (Ferreirós 2001, p. 445).

Fraenkel (1922) proved that the axiom of choice cannot be proved from the remaining axioms of Zermelo’s set theory with urelements. Later work by Paul Cohen (1966) showed that the addition of urelements is not needed, and the axiom of choice is unprovable in ZF. Cohen’s proof developed the method of forcing, which is now an important tool for establishing independence results in set theory.

[edit] Symbolic logic

Leopold Löwenheim (1918) and Thoralf Skolem (1919) obtained the Löwenheim–Skolem theorem, which says that first-order logic cannot control the cardinalities of infinite structures. Skolem realized that this theorem would apply to first-order formalizations of set theory, and that it implies any such formalization has a countable model. This counterintuitive fact became known as Skolem’s paradox.

In his doctoral thesis, Kurt Gödel (1929) proved the completeness theorem, which establishes a correspondence between syntax and semantics in first-order logic. Gödel used the completeness theorem to prove the compactness theorem, demonstrating the finitary nature of first-order logical consequence. These results helped establish first-order logic as the dominant logic used by mathematicians.

In 1931, Gödel published On Formally Undecidable Propositions of Principia Mathematica and Related Systems, which proved the incompleteness (in a different meaning of the word) of all sufficiently strong, effective first-order theories. This result, known as Gödel’s incompleteness theorem, establishes severe limitations on axiomatic foundations for mathematics, striking a strong blow to Hilbert’s program. It showed the impossibility of providing a consistency proof of arithmetic within any formal theory of arithmetic. Hilbert, however, did not acknowledge the importance of the incompleteness theorem for some time.

Gödel’s theorem shows that a consistency proof of any sufficiently strong, effective axiom system cannot be obtained in the system itself, if the system is consistent, nor in any weaker system. This leaves open the possibility of consistency proofs that cannot be formalized within the system they consider. Gentzen (1936) proved the consistency of arithmetic using a finitistic system together with a principle of transfinite induction. Gentzen’s result introduced the ideas of cut elimination and proof-theoretic ordinals, which became key tools in proof theory. Gödel (1958) gave a different consistency proof, which reduces the consistency of classical arithmetic to that of intutitionistic arithmetic in higher types.

[edit] Beginnings of the other branches

Alfred Tarski developed the basics of model theory.

Beginning in 1935, a group of prominent mathematicians collaborated under the pseudonym Nicolas Bourbaki to publish a series of encyclopedic mathematics texts. These texts, written in an austere and axiomatic style, emphasized rigorous presentation and set-theoretic foundations. Terminology coined by these texts, such as the words bijection, injection, and surjection, and the set-theoretic foundations the texts employed, were widely adopted throughout mathematics.

The study of computability came to be known as recursion theory, because early formalizations by Gödel and Kleene relied on recursive definitions of functions.[2] When these definitions were shown equivalent to Turing’s formalization involving Turing machines, it became clear that a new concept – the computable function – had been discovered, and that this definition was robust enough to admit numerous independent characterizations. In his work on the incompleteness theorems in 1931, Gödel lacked a rigorous concept of an effective formal system; he immediately realized that the new definitions of computability could be used for this purpose, allowing him to state the incompleteness theorems in generality that could only be implied in the original paper.

Numerous results in recursion theory were obtained in the 1940s by Stephen Cole Kleene and Emil Leon Post. Kleene (1943) introduced the concepts of relative computability, foreshadowed by Turing (1939), and the arithmetical hierarchy. Kleene later generalized recursion theory to higher-order functionals. Kleene and Kreisel studied formal versions of intuitionistic mathematics, particularly in the context of proof theory.

[edit] Subfields and scope

Contemporary mathematical logic is roughly divided into four areas: set theory, model theory, recursion theory, and proof theory and constructive mathematics. Each area has a distinct focus, although many techniques and results are shared between multiple areas. The border lines between these fields, and the lines between mathematical logic and other fields of mathematics, are not always sharp. Gödel’s incompleteness theorem marks not only a milestone in recursion theory and proof theory, but has also led to Loeb’s theorem in modal logic. The method of forcing is employed in set theory, model theory, and recursion theory, as well as in the study of intuitionistic mathematics.

The mathematical field of category theory uses many formal axiomatic methods, and includes the study of categorical logic, but category theory is not ordinarily considered a subfield of mathematical logic. Because of its applicability in diverse fields of mathematics, mathematicians including Saunders Mac Lane have proposed category theory as a foundational system for mathematics, independent of set theory. These foundations use toposes, which resemble generalized models of set theory that may employ classical or nonclassical logic.

[edit] Formal logic

At its core, mathematical logic deals with mathematical concepts expressed using formal logical systems. These systems, though they differ in many details, share the common property of considering only expressions in a fixed formal language, or signature. The system of first-order logic is the most widely studied today, because of its applicability to foundations of mathematics and because of its desirable proof-theoretic properties.[3] Stronger classical logics such as second-order logic or infinitary logic are also studied, along with nonclassical logics such as intuitionistic logic.

[edit] First-order logic

Main article: First-order logic

First-order logic is a particular formal system of logic. Its syntax involves only finite expressions as well-formed formulas, while its semantics are characterized by the limitation of all quantifiers to a fixed domain of discourse.

Early results about formal logic established limitations of first-order logic. The Löwenheim–Skolem theorem (1919) showed that if a set of sentences in a countable first-order language has an infinite model then it has at least one model of each infinite cardinality. This shows that it is impossible for a set of first-order axioms to characterize the natural numbers, the real numbers, or any other infinite structure up to isomorphism. As the goal of early foundational studies was to produce axiomatic theories for all parts of mathematics, this limitation was particularly stark.

Gödel’s completeness theorem (Gödel 1929) established the equivalence between semantic and syntactic definitions of logical consequence in first-order logic. It shows that if a particular sentence is true in every model that satisfies a particular set of axioms, then there must be a finite deduction of the sentence from the axioms. The compactness theorem first appeared as a lemma in Gödel’s proof of the completeness theorem, and it took many years before logicians grasped its significance and began to apply it routinely. It says that a set of sentences has a model if and only if every finite subset has a model, or in other words that an inconsistent set of formulas must have a finite inconsistent subset. The completeness and compactness theorems allow for sophisticated analysis of logical consequence in first-order logic and the development of model theory, and they are a key reason for the prominence of first-order logic in mathematics.

Gödel’s incompleteness theorems (Gödel 1931) establish additional limits on first-order axiomatizations. The first incompleteness theorem states that no sufficiently strong, effectively given logical system can prove its own consistency unless it is actually inconsistent. Here a logical system is effectively given if it is possible to decide, given any formula in the language of the system, whether the formula is an axiom. When applied to first-order logic, the first incompleteness theorem implies that any sufficiently strong, consistent, effective first-order theory has models that are not elementarily equivalent, a stronger limitation than the one established by the Löwenheim–Skolem theorem. The second incompleteness theorem states that no sufficiently strong, consistent, effective axiom system for arithmetic can prove its own consistency, which has been interpreted to show that Hilbert’s program cannot be completed.

[edit] Other classical logics

Many logics besides first-order logic are studied. These include infinitary logics, which allow for formulas to provide an infinite amount of information, and higher-order logics, which include a portion of set theory directly in their semantics.

The most well studied infinitary logic is L_{\omega_1,\omega}. In this logic, quantifiers may only be nested to finite depths, as in first order logic, but formulas may have finite or countably infinite conjunctions and disjunctions within them. Thus, for example, it is possible to say that an object is a natural number using a formula of L_{\omega_1,\omega} such as

(x = 0) \lor (x = 1) \lor (x = 2) \lor \cdots.

Higher-order logics allow for quantification not only of elements of the domain of discourse, but subsets of the domain of discourse, sets of such subsets, and other objects of higher type. The semantics are defined so that, rather than having a separate domain for each higher-type quantifier to range over, the quantifiers instead range over all objects of the appropriate type. The logics studied before the development of first-order logic, for example Frege’s logic, had similar set-theoretic aspects. Although higher-order logics are more expressive, allowing complete axiomatizations of structures such as the natural numbers, they do not satisfy analogues of the completeness and compactness theorems from first-order logic, and are thus less amenable to proof-theoretic analysis.

[edit] Nonclassical and modal logic

Modal logics include additional modal operators, such as an operator which states that a particular formula is not only true, but necessarily true. Although modal logic is not often used to axiomatize mathematics, it has been used to study the properties of first-order provability (Solovay 1976) and set-theoretic forcing (Hamkins and Löwe 2007).

Intuitionistic logic was developed by Heyting to study Brouwer’s program of intuitionism, in which Brouwer himself avoided formalization. Intuitionistic logic specifically does not include the law of the excluded middle, which states that each sentence is either true or its negation is true. Kleene’s work with the proof theory of intuitionistic logic showed that constructive information can be recovered from intuitionistic proofs. For example, any provably total function in intuitionistic arithmetic is computable; this is not true in classical theories of arithmetic such as Peano arithmetic.

[edit] Algebraic logic

Algebraic logic uses the methods of abstract algebra to study the semantics of formal logics. A fundamental example is the use of Boolean algebras to represent truth values in classical propositional logic, and the use of Heyting algebras to represent truth values in intuitionistic propositional logic. Stronger logics, such as first-order logic and higher-order logic, are studied using more complicated algebraic structures such as cylindric algebras.

[edit] Set theory

Main article: Set theory

Set theory is the study of sets, which are abstract collections of objects. Many of the basic notions, such as ordinal and cardinal numbers, were developed informally by Cantor before formal axiomatizations of set theory were developed. The first such axiomatization, due to Zermelo (1908), was extended slightly to become Zermelo–Fraenkel set theory (ZF), which is now the most widely-used foundational theory for mathematics.

Other formalizations of set theory have been proposed, including von Neumann–Bernays–Gödel set theory (NBG), Morse–Kelley set theory (MK), and New Foundations (NF). Of these, ZF, NBG, and MK are similar in describing a cumulative hierarchy of sets. New Foundations takes a different approach; it allows objects such as the set of all sets at the cost of restrictions on its set-existence axioms. The system of Kripke-Platek set theory is closely related to generalized recursion theory.

Two famous statements in set theory are the axiom of choice and the continuum hypothesis. The axiom of choice, first stated by Zermelo (1904), was proved independent of ZF by Fraenkel (1922), but has come to be widely accepted by mathematicians. It states that given a collection of nonempty sets there is a single set C that contains exactly one element from each set in the collection. The set C is said to “choose” one element from each set in the collection. While the ability to make such a choice is considered obvious by some, since each set in the collection is nonempty, the lack of a general, concrete rule by which the choice can be made renders the axiom nonconstructive. Stefan Banach and Alfred Tarski (1924) showed that the axiom of choice can be used to decompose a solid ball into a finite number of pieces which can then be rearranged, with no scaling, to make two solid balls of the original size. This theorem, known as the Banach-Tarski paradox, is one of many counterintuitive results of the axiom of choice.

The continuum hypothesis, first proposed as a conjecture by Cantor, was listed by David Hilbert as one of his 23 problems in 1900. Gödel showed that the continuum hypothesis cannot be disproven from the axioms of Zermelo-Frankel set theory (with or without the axiom of choice), by developing the constructible universe of set theory in which the continuum hypothesis must hold. In 1963, Paul Cohen showed that the continuum hypothesis cannot be proven from the axioms of Zermelo-Frankel set theory (Cohen 1966). This independence result did not completely settle Hilbert’s question, however, as it is possible that new axioms for set theory could resolve the hypothesis. Recent work along these lines has been conducted by W. Hugh Woodin, although its importance is not yet clear (Woodin 2001).

Contemporary research in set theory includes the study of large cardinals and determinacy. Large cardinals are cardinal numbers with particular properties so strong that the existence of such cardinals cannot be proved in ZFC. The existence of the smallest large cardinal typically studied, an inaccessible cardinal, already implies the consistency of ZFC. Despite the fact that large cardinals have extremely high cardinality, their existence has many ramifications for the structure of the real line. Determinacy refers to the possible existence of winning strategies for certain two-player games (the games are said to be determined). The existence of these strategies implies structural properties of the real line and other Polish spaces.

[edit] Model theory

Main article: Model theory

Model theory studies the models of various formal theories. Here a theory is a set of formulas in a particular formal logic and signature, while a model is a structure that gives a concrete interpretation of the theory. Model theory is closely related to universal algebra and algebraic geometry, although the methods of model theory focus more on logical considerations than those fields.

The set of all models of a particular theory is called an elementary class; classical model theory seeks to determine the properties of models in a particular elementary class, or determine whether certain classes of structures form elementary classes.

The method of quantifier elimination can be used to show that definable sets in particular theories cannot be too complicated. Tarski (1948) established quantifier elimination for real-closed fields, a result which also shows the theory of the field of real numbers is decidable. (He also noted that his methods were equally applicable to algebraically closed fields of arbitrary characteristic.) A modern subfield developing from this is concerned with o-minimal structures.

Morley’s categoricity theorem, proved by Michael D. Morley (1965), states that if a first-order theory in a countable language is categorical in some uncountable cardinality, i.e. all models of this cardinality are isomorphic, then it is categorical in all uncountable cardinalities.

A trivial consequence of the continuum hypothesis is that a complete theory with less than continuum many nonisomorphic countable models can have only countably many. Vaught’s conjecture, named after Robert Lawson Vaught, says that this is true even independently of the continuum hypothesis. Many special cases of this conjecture have been established.

[edit] Recursion theory

Main article: Recursion theory

Recursion theory, also called computability theory, studies the properties of computable functions and the Turing degrees, which divide the uncomputable functions into sets which have the same level of uncomputability. Recursion theory also includes the study of generalized computability and definability. Recursion theory grew from of the work of Alonzo Church and Alan Turing in the 1930s, which was greatly extended by Kleene and Post in the 1940s.

Classical recursion theory focuses on the computability of functions from the natural numbers to the natural numbers. The fundamental results establish a robust, canonical class of computable functions with numerous independent, equivalent characterizations using Turing machines, λ calculus, and other systems. More advanced results concern the structure of the Turing degrees and the lattice of recursively enumerable sets.

Generalized recursion theory extends the ideas of recursion theory to computations that are no longer necessarily finite. It includes the study of computability in higher types as well as areas such as hyperarithmetical theory and α-recursion theory.

Contemporary research in recursion theory includes the study of applications such as algorithmic randomness and computable model theory as well as new results in pure recursion theory.

[edit] Algorithmically unsolvable problems

An important subfield of recursion theory studies algorithmic unsolvability; a problem is algorithmically unsolvable if there is no computable function which, given any instance of the problem, returns the correct answer. The first results about unsolvability, obtained independently by Church and Turing in 1936, showed that the Entscheidungsproblem is algorithmically unsolvable. Turing proved this by establishing the unsolvability of the halting problem, a result with far-ranging implications in both recursion theory and computer science.

There are many known examples of undecidable problems from ordinary mathematics. The word problem for groups was proved algorithmically unsolvable by Pyotr Sergeyevich Novikov in 1955 and independently by W. Boone in 1959. The busy beaver problem, developed by Tibor Radó in 1962, is another well-known example.

Hilbert's tenth problem asked for an algorithm to determine whether a multivariate polynomial equation with integer coefficients has a solution in the integers. Partial progress was made by Julia Robinson, Martin Davis, and Hilary Putnam. The algorithmic unsolvability of the problem was proved by Yuri Matiyasevich in 1970 (Davis 1973).

[edit] Proof theory and constructive mathematics

Main article: Proof theory

Proof theory is the study of formal proofs in various logical deduction systems. These proofs are represented as formal mathematical objects, facilitating their analysis by mathematical techniques. Several deduction systems are commonly considered, including Hilbert-style deduction systems, systems of natural deduction, and the sequent calculus developed by Gentzen.

The study of constructive mathematics, in the context of mathematical logic, includes the study of systems in non-classical logic such as intuitionistic logic, as well as the study of predicative systems. An early proponent of predicativism was Hermann Weyl, who showed it is possible to develop a large part of real analysis using only predicative methods (Weyl 1918).

Because proofs are entirely finitary, whereas truth in a structure is not, it is common for work in constructive mathematics to emphasize provability. The relationship between provability in classical (or nonconstructive) systems and provability in intuitionistic (or constructive, respectively) systems is of particular interest. Results such as the Gödel–Gentzen negative translation show that it is possible to embed (or translate) classical logic into intuitionistic logic, allowing some properties about intuitionistic proofs to be transferred back to classical proofs.

Recent developments in proof theory include the study of proof mining by Ulrich Kohlenbach and the study of proof-theoretic ordinals by Michael Rathjen.

[edit] Connections with computer science

The study of computability theory in computer science is closely related to the study of computability in mathematical logic. There is a difference of emphasis, however. Computer scientists often focus on concrete programming languages and feasible computability, while researchers in mathematical logic often focus on computability as a theoretical concept and on noncomputability.

The study of programming language semantics is related to model theory, as is program verification (in particular, model checking). The Curry-Howard isomorphism between proofs and programs relates to proof theory, especially intuitionistic logic. Formal calculi such as the lambda calculus and combinatory logic are now studied as idealized programming languages.

Computer science also contributes to mathematics by developing techniques for the automatic checking or even finding of proofs, such as automated theorem proving and logic programming.

[edit] Foundations of mathematics

In the 19th century, mathematicians became aware of logical gaps and inconsistencies in their field. It was shown that Euclid's axioms for geometry, which had been taught for centuries as an example of the axiomatic method, were incomplete. The use of infinitesimals, and the very definition of function, came into question in analysis, as pathological examples such as Weierstrass' nowhere-differentiable continuous function were discovered.

Cantor's study of arbitrary infinite sets also drew criticism. Leopold Kronecker famously stated "God made the integers; all else is the work of man," endorsing a return to the study of finite, concrete objects in mathematics. Although Kronecker's argument was carried forward by constructivists in the 20th century, the mathematical community as a whole rejected them. David Hilbert argued in favor of the study of the infinite, saying "No one shall expel us from the Paradise that Cantor has created."

Mathematicians began to search for axiom systems that could be used to formalize large parts of mathematics. In addition to removing ambiguity from previously-naive terms such as function, it was hoped that this axiomatization would allow for consistency proofs. In the 19th century, the main method of proving the consistency of a set of axioms was to provide a model for it. Thus, for example, non-Euclidean geometry can be proved consistent by defining point to mean a point on a fixed sphere and line to mean a great circle on the sphere. The resulting structure, a model of elliptic geometry, satisfies the axioms of plane geometry except the parallel postulate.

With the development of formal logic, Hilbert asked whether it would be possible to prove that an axiom system is consistent by analyzing the structure of possible proofs in the system, and showing through this analysis that it is impossible to prove a contradiction. This idea led to the study of proof theory. Moreover, Hilbert proposed that the analysis should be entirely concrete, using the term finitary to refer to the methods he would allow but not precisely defining them. This project, known as Hilbert's program, was seriously affected by Gödel's incompleteness theorems, which show that the consistency of formal theories of arithmetic cannot be established using methods formalizable in those theories. Gentzen showed that it is possible to produce a proof of the consistency of arithmetic in a finitary system augmented with axioms of transfinite induction, and the techniques he developed to so do were seminal in proof theory.

A second thread in the history of foundations of mathematics involves nonclassical logics and constructive mathematics. The study of constructive mathematics includes many different programs with various definitions of constructive. At the most accommodating end, proofs in ZF set theory that do not use the axiom of choice are called constructive by many mathematicians. More limited versions of constructivism limit themselves to natural numbers, number-theoretic functions, and sets of natural numbers (which can be used to represent real numbers, facilitating the study of mathematical analysis). A common idea is that in order to assert that a number-theoretic function exists, a concrete means of computing the values of the function must be known.

In the early 20th century, Luitzen Egbertus Jan Brouwer founded intuitionism as a philosophy of mathematics. This philosophy, poorly understood at first, stated that in order for a mathematical statement to be true to a mathematician, that person must be able to intuit the statement, to not only believe its truth but understand the reason for its truth. A consequence of this definition of truth was the rejection of the law of the excluded middle, for there are statements that, according to Brouwer, could not be claimed to be true while their negations also could not be claimed true. Brouwer's philosophy was influential, and the cause of bitter disputes among prominent mathematicians. Later, Kleene and Kreisel would study formalized versions of intuitionistic logic (Brouwer rejected formalization, and presented his work in unformalized natural language). With the advent of the BHK interpretation and Kripke models, intuitionism became easier to reconcile with classical mathematics.

[edit] See also

[edit] Notes

  1. ^ Undergraduate texts include Boolos, Burgess, and Jeffrey (2002), Enderton (2002), and Mendelson (1997). A classic graduate text by Shoenfield (2001) first appeared in 1967.
  2. ^ A detailed study of this terminology is given by Soare (1996).
  3. ^ Ferreirós (2001) surveys the rise of first-order logic over other formal logics in the early 20th century.

[edit] References

[edit] Undergraduate texts

[edit] Graduate texts

[edit] Research papers, monographs, texts, and surveys

[edit] Classical papers, texts, and collections

  • Burali-Forti, Cesare (1897), A question on transfinite numbers , reprinted in van Heijenoort 1976, pp. 104–111.
  • Dedekind, Richard (1872), Stetigkeit und irrationale Zahlen . English translation of title: "Consistency and irrational numbers".
  • Dedekind, Richard (1888), Was sind und was sollen die Zahlen?  Two English translations:
    • 1963 (1901). Essays on the Theory of Numbers. Beman, W. W., ed. and trans. Dover.
    • 1996. In From Kant to Hilbert: A Source Book in the Foundations of Mathematics, 2 vols, Ewald, William B., ed., Oxford University Press: 787–832.
  • Fraenkel, Abraham A. (1922), "Der Begriff 'definit' und die Unabhängigkeit des Auswahlsaxioms", Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse, pp. 253–257  (German), reprinted in English translation as "The notion of 'definite' and the independence of the axiom of choice", van Heijenoort 1976, pp. 284–289.
  • Gentzen, Gerhard (1936), "Die Widerspruchsfreiheit der reinen Zahlentheorie", Mathematische Annalen 112: 132–213, doi:10.1007/BF01565428 , reprinted in English translation in Gentzen's Collected works, M. E. Szabo, ed., North-Holland, Amsterdam, 1969.[specify]
  • Gödel, Kurt (1929), "Über die Vollständigkeit des Logikkalküls", doctoral dissertation, University Of Vienna . English translation of title: "Completeness of the logical calculus".
  • Gödel, Kurt (1930), "Die Vollständigkeit der Axiome des logischen Funktionen-kalküls", Monatshefte für Mathematik und Physik 37: 349–360, doi:10.1007/BF01696781 . English translation of title: "The completeness of the axioms of the calculus of logical functions".
  • Gödel, Kurt (1931), "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I", Monatshefte für Mathematik und Physik 38 (1): 173–198, doi:10.1007/BF01700692 , see On Formally Undecidable Propositions of Principia Mathematica and Related Systems for details on English translations.
  • Gödel, Kurt (1958), "Über eine bisher noch nicht benützte Erweiterung des finiten Standpunktes", Dialectica. International Journal of Philosophy 12: 280–287, doi:10.1111/j.1746-8361.1958.tb01464.x , reprinted in English translation in Gödel's Collected Works, vol II, Soloman Feferman et al., eds. Oxford University Press, 1990.[specify]
  • van Heijenoort, Jean, ed. (1967, 1976 3rd printing with corrections), From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931 (3rd ed.), Cambridge, Mass: Harvard University Press, ISBN 0-674-32449-8 (pbk.) 
  • Hilbert, David (1889), The Foundations of Geometry , republished 1980, Open Court, Chicago.
  • David, Hilbert (1929), "Probleme der Grundlegung der Mathematik", Mathematische Annalen 102: 1–9, doi:10.1007/BF01782335 . Lecture given at the International Congress of Mathematicians, 3 September 1928. Published in English translation as "The Grounding of Elementary Number Theory", in Mancosu 1998, pp. 266–273.
  • Kleene, Stephen Cole (1943), "Recursive Predicates and Quantifiers", American Mathematical Society Transactions 54 (1): 41–73, doi:10.2307/1990131 .
  • Lobachevsky, Nikolai (1840), Geometrishe Untersuchungen zur Theorie der Parellellinien  (German), reprinted in English translation as "Geometric Investigations on the Theory of Parallel Lines" in Non-Euclidean Geometry, Robert Bonola (ed.), Dover, 1955. ISBN 0486600270
  • Leopold Löwenheim (1918)[citation needed]
  • Mancosu, Paolo, ed. (1998), From Brouwer to Hilbert. The Debate on the Foundations of Mathematics in the 1920s, Oxford: Oxford University Press .
  • Peano, Giuseppe (1888), Arithmetices principia, nova methodo exposita  (Italian), excerpt reprinted in English stranslation as "The principles of arithmetic, presented by a new method", van Heijenoort 1976, pp. 83 97.
  • Richard, Jules (1905), "Les principes des mathématiques et le problème des ensembles", Revue générale des sciences pures et appliquées 16: 541  (French), reprinted in English translation as "The principles of mathematics and the problems of sets", van Heijenoort 1976, pp. 142–144.
  • Thoralf Skolem (1919)[citation needed]
  • Tarski, Alfred (1948), A decision method for elementary algebra and geometry, Santa Monica, California: RAND Corporation 
  • Turing, Alan M. (1939), "Systems of Logic Based on Ordinals", Proceedings of the London Mathematical Society 45 (2): 161–228, doi:10.1112/plms/s2-45.1.161 
  • Zermelo, Ernst (1904), "Beweis, daß jede Menge wohlgeordnet werden kann", Mathematische Annalen 59: 514–516, doi:10.1007/BF01445300  (German), reprinted in English translation as "Proof that every set can be well-ordered", van Heijenoort 1976, pp. 139–141.
  • Zermelo, Ernst (1908), "Neuer Beweis für die Möglichkeit einer Wohlordnung", Mathematische Annalen 65: 107–128, doi:10.1007/BF01450054  (German), reprinted in English translation as "A new proof of the possibility of a well-ordering", van Heijenoort 1976, pp. 183–198.

[edit] External links


v  d  e

Major fields of mathematics


v  d  e


Portal ·Category ·WikiProject ·Logic stubs ·Mathlogic stubs ·Cleanup ·Talk ·changes


History and core topics

Core topics


Key concepts and logics

of logic
Other non





Key figures

Alfarabi · Algazel · Alkindus · Al-Razi · Aristotle · Averroes · Avicenna · Boole · Cantor · Carnap · Church · Dharmakirti · Dignāga · Frege · Gentzen · Kanada · Gödel · Gotama · Hilbert · Ibn al-Nafis · Ibn Hazm · Ibn Taymiyyah · Kripke · Leibniz · Mozi · Nagarjuna · Pāṇini · Peano · Peirce · Putnam · Quine · Russell · Skolem · Suhrawardi · Tarski · Turing · Whitehead · Zadeh



Outline of logic · General · Basic · Mathematical logic · Boolean algebra · Set theory

Read Full Post »



From Wikipedia, the free encyclopedia

Jump to: navigation, search

“Syntactic” redirects here. For another meaning of the adjective, see Syntaxis.
For other uses, see Syntax (disambiguation).

This article needs additional citations for verification.
Please help improve this article by adding reliable references. Unsourced material may be challenged and removed. (May 2007)

Theoretical linguistics
Generative linguistics
Descriptive linguistics
Historical linguistics
Comparative linguistics
Corpus linguistics
Applied linguistics
Language acquisition
Language assessment
Language development
Language education
Linguistic anthropology
Cognitive linguistics
Computational linguistics
History of linguistics
List of linguists
Unsolved problems
v  d  e

In linguistics, syntax (from Ancient Greek συν- syn-, “together”, and τάξις táxis, “arrangement”) is the study of the principles and rules for constructing sentences in natural languages. In addition to referring to the discipline, the term syntax is also used to refer directly to the rules and principles that govern the sentence structure of any individual language, as in “the syntax of Modern Irish.”

Modern research in syntax attempts to describe languages in terms of such rules. Many professionals in this discipline attempt to find general rules that apply to all natural languages. The term syntax is also sometimes used to refer to the rules governing the behavior of mathematical systems, such as logic, artificial formal languages, and computer programming languages.



[edit] Early history

Works on grammar were being written long before modern syntax came about; the Aṣṭādhyāyī of Pāṇini is often cited as an example of a pre-modern work that approaches the sophistication of a modern syntactic theory.[1] In the West, the school of thought that came to be known as “traditional grammar” began with the work of Dionysius Thrax.

For centuries, work in syntax was dominated by a framework known as grammaire générale, first expounded in 1660 by Antoine Arnauld in a book of the same title. This system took as its basic premise the assumption that language is a direct reflection of thought processes and therefore there is a single, most natural way to express a thought. That way, coincidentally, was exactly the way it was expressed in French.

However, in the 19th century, with the development of historical-comparative linguistics, linguists began to realize the sheer diversity of human language, and to question fundamental assumptions about the relationship between language and logic. It became apparent that there was no such thing as a most natural way to express a thought, and therefore logic could no longer be relied upon as a basis for studying the structure of language.

The Port-Royal grammar modeled the study of syntax upon that of logic (indeed, large parts of the Port-Royal Logic were copied or adapted from the Grammaire générale[2]). Syntactic categories were identified with logical ones, and all sentences were analyzed in terms of “Subject – Copula – Predicate”. Initially, this view was adopted even by the early comparative linguists such as Franz Bopp.

The central role of syntax within theoretical linguistics became clear only in the 20th century, which could reasonably be called the “century of syntactic theory” as far as linguistics is concerned. For a detailed and critical survey of the history of syntax in the last two centuries, see the monumental work by Graffi (2001).

[edit] Modern theories

There are a number of theoretical approaches to the discipline of syntax. Many linguists (e.g. Noam Chomsky) see syntax as a branch of biology, since they conceive of syntax as the study of linguistic knowledge as embodied in the human mind. Others (e.g. Gerald Gazdar) take a more Platonistic view, since they regard syntax to be the study of an abstract formal system.[3] Yet others (e.g. Joseph Greenberg) consider grammar a taxonomical device to reach broad generalizations across languages. Some of the major approaches to the discipline are listed below.

[edit] Generative grammar

Main article: Generative grammar

The hypothesis of generative grammar is that language is a structure of the human mind. The goal of generative grammar is to make a complete model of this inner language (known as i-language). This model could be used to describe all human language and to predict the grammaticality of any given utterance (that is, to predict whether the utterance would sound correct to native speakers of the language). This approach to language was pioneered by Noam Chomsky. Most generative theories (although not all of them) assume that syntax is based upon the constituent structure of sentences. Generative grammars are among the theories that focus primarily on the form of a sentence, rather than its communicative function.

Among the many generative theories of linguistics, the Chomskyan theories are:

  • Transformational Grammar (TG) (Original theory of generative syntax laid out by Chomsky in Syntactic Structures in 1957[4])
  • Government and binding theory (GB) (revised theory in the tradition of TG developed mainly by Chomsky in the 1970s and 1980s).[5]
  • The Minimalist Program (MP) (revised version of GB published by Chomsky in 1995)[6]

Other theories that find their origin in the generative paradigm are:

[edit] Categorial grammar

Categorial grammar is an approach that attributes the syntactic structure not to rules of grammar, but to the properties of the syntactic categories themselves. For example, rather than asserting that sentences are constructed by a rule that combines a noun phrase (NP) and a verb phrase (VP) (e.g. the phrase structure rule S → NP VP), in categorial grammar, such principles are embedded in the category of the head word itself. So the syntactic category for an intransitive verb is a complex formula representing the fact that the verb acts as a functor which requires an NP as an input and produces a sentence level structure as an output. This complex category is notated as (NP\S) instead of V. NP\S is read as ” a category that searches to the left (indicated by \) for a NP (the element on the left) and outputs a sentence (the element on the right)”. The category of transitive verb is defined as an element that requires two NPs (its subject and its direct object) to form a sentence. This is notated as (NP/(NP\S)) which means “a category that searches to the right (indicated by /) for an NP (the object), and generates a function (equivalent to the VP) which is (NP\S), which in turn represents a function that searches to the left for an NP and produces a sentence).

Tree-adjoining grammar is a categorial grammar that adds in partial tree structures to the categories.

[edit] Dependency grammar

Dependency grammar is a different type of approach in which structure is determined by the relations (such as grammatical relations) between a word (a head) and its dependents, rather than being based in constituent structure. For example, syntactic structure is described in terms of whether a particular noun is the subject or agent of the verb, rather than describing the relations in terms of phrases.

Some dependency-based theories of syntax:

[edit] Stochastic/probabilistic grammars/network theories

Theoretical approaches to syntax that are based upon probability theory are known as stochastic grammars. One common implementation of such an approach makes use of a neural network or connectionism. Some theories based within this approach are:

[edit] Functionalist grammars

Functionalist theories, although focused upon form, are driven by explanation based upon the function of a sentence (i.e. its communicative function). Some typical functionalist theories include:

[edit] See also

[edit] Syntactic terms

[edit] Notes

  1. ^ Fortson IV, Benjamin W. (2004). Indo-European Language and Culture: An Introduction. Blackwell. p. 186. ISBN 1-4051-0315-9 (hb); 1-4051-0316-7 (pb). “[The Aṣṭādhyāyī] is a highly precise and thorough description of the structure of Sanskrit somewhat resembling modern generative grammar…[it] remained the most advanced linguistic analysis of any kind until the twentieth century.” 
  2. ^ Arnauld, Antoine (1683). La logique (5th ed.). Paris: G. Desprez. pp. 137. http://visualiseur.bnf.fr/Visualiseur?Destination=Gallica&O=NUMM-57444. “Nous avons emprunté…ce que nous avons dit…d’un petit Livre…sous le titre de Grammaire générale. 
  3. ^ Ted Briscoe, 2 May 2001, Interview with Gerald Gazdar. Retrieved 200806-04.
  4. ^ Chomsky, Noam. 1957. Syntactic Structures. The Hague/Paris: Mouton, p. 15.
  5. ^ Chomsky, Noam (1981/1993). Lectures on Government and Binding: The Pisa Lectures. Mouton de Gruyter.
  6. ^ Chomsky, Noam (1995). The Minimalist Program. MIT Press.

[edit] References

  • Brown, Keith; Jim Miller (eds.) (1996). Concise Encyclopedia of Syntactic Theories. New York: Elsevier Science. ISBN 0-08-042711-1. 
  • Carnie, Andrew (2006). Syntax: A Generative Introduction. Oxford: Wiley-Blackwell. ISBN 1405133848. 
  • Freidin, Robert; Howard Lasnik (eds.) (2006). Syntax. Critical Concepts in Linguistics. New York: Routledge. ISBN 0-415-24672-5. 
  • Graffi, Giorgio (2001). 200 Years of Syntax. A Critical Survey. Studies in the History of the Language Sciences 98. Amsterdam: Benjamins. ISBN 90-272-4587-8. 

[edit] External links

Read Full Post »

Operator Grammar

Operator Grammar

From Wikipedia, the free encyclopedia

Jump to: navigation, search

Operator Grammar is a mathematical theory of human language that explains how language carries information. This theory is the culmination of the life work of Zellig Harris, with major publications toward the end of the last century. Operator Grammar proposes that each human language is a self-organizing system in which both the syntactic and semantic properties of a word are established purely in relation to other words. Thus, no external system (metalanguage) is required to define the rules of a language. Instead, these rules are learned through exposure to usage and through participation, as is the case with most social behavior. The theory is consistent with the idea that language evolved gradually, with each successive generation introducing new complexity and variation.

Operator Grammar posits three universal constraints: Dependency (certain words depend on the presence of other words to form an utterance), Likelihood (some combinations of words and their dependents are more likely than others) and Reduction (words in high likelihood combinations can be reduced to shorter forms, and sometimes omitted completely). Together these provide a theory of language information: dependency builds a predication structure; likelihood creates distinct meanings; reduction allows compact forms for communication.



[edit] Dependency

The fundamental mechanism of Operator Grammar is the dependency constraint: certain words (operators) require that one or more words (arguments) be present in an utterance. In the sentence John wears boots, the operator wears requires the presence of two arguments, such as John and boots. (This definition of dependency differs from other dependency grammars in which the arguments are said to depend on the operators.)

In each language the dependency relation among words gives rise to syntactic categories in which the allowable arguments of an operator are defined in terms of their dependency requirements. Class N contains words (e.g. John, boots) that do not require the presence of other words. Class ON contains the words (e.g. sleeps) that require exactly one word of type N. Class ONN contains the words (e.g. wears) that require two words of type N. Class OOO contains the words (e.g. because) that require two words of type O, as in John stumbles because John wears boots. Other classes include OO (is possible), ONNN (put), OON (with, surprise), ONO (know), ONNO (ask) and ONOO (attribute).

The categories in Operator Grammar are universal and are defined purely in terms of how words relate to other words, and do not rely on an external set of categories such as noun, verb, adjective, adverb, preposition, conjunction, etc. The dependency properties of each word are observable through usage and therefore learnable.

[edit] Likelihood

The dependency constraint creates a structure (syntax) in which any word of the appropriate class can be an argument for a given operator. The likelihood constraint places additional restrictions on this structure by making some operator/argument combinations more likely than others. Thus, John wears hats is more likely than John wears snow which in turn is more likely than John wears vacation. The likelihood constraint creates meaning (semantics) by defining each word in terms of the words it can take as arguments, or of which it can be an argument.

Each word has a unique set of words with which it has been observed to occur called its selection. The coherent selection of a word is the set of words for which the dependency relation has above average likelihood. Words that are similar in meaning have similar coherent selection. This approach to meaning is self-organizing in that no external system is necessary to define what words mean. Instead, the meaning of the word is determined by its usage within a population of speakers. Patterns of frequent use are observable and therefore learnable. New words can be introduced at any time and defined through usage.

[edit] Reduction

The reduction constraint acts on high likelihood combinations of operators and arguments and makes more compact forms. Certain reductions allow words to be omitted completely from an utterance. For example, I expect John to come is reducible to I expect John, because to come is highly likely under expect. The sentence John wears boots and John wears hats can be reduced to John wears boots and hats because repetition of the first argument John under the operator and is highly likely. John reads things can be reduced to John reads, because the argument things has high likelihood of occurring under any operator.

Certain reductions reduce words to shorter forms, creating pronouns, suffixes and prefixes (morphology). John wears boots and John wears hats can be reduced to John wears boots and he wears hats, where the pronoun he is a reduced form of John. Suffixes and prefixes can be obtained by appending other freely occurring words, or variants of these. John is able to be liked can be reduced to John is likeable. John is thoughtful is reduced from John is full of thought, and John is anti-war from John is against war.

Modifiers are the result of several of these kinds of reductions, which give rise to adjectives, adverbs, prepositional phrases, subordinate clauses, etc.

  1. John wears boots; the boots are of leather (two sentences joined by semicolon operator) →
  2. John wears boots which are of leather (reduction of repeated noun to relative pronoun) →
  3. John wears boots of leather (omission of high likelihood phrase which are) →
  4. John wears leather boots (omission of high likelihood operator of, transposition of short modifier to left of noun)

Each language has a unique set of reductions. For example, some languages have morphology and some don’t; some transpose short modifiers and some do not. Each word in a language participates only in certain kinds of reductions. However, in each case, the reduced material can be reconstructed from knowledge of what is likely in the given operator/argument combination. The reductions in which each word participates are observable and therefore learnable, just as one learns a word’s dependency and likelihood properties.

[edit] Information

The importance of reductions in Operator Grammar is that they separate sentences that contain reduced forms from those that don’t (base sentences). All reductions are paraphrases, since they do not remove any information, just make sentences more compact. Thus, the base sentences contain all the information of the language and the reduced sentences are variants of these. Base sentences are made up of simple words without modifiers and largely without affixes, e.g. Snow falls, Sheep eat grass, John knows sheep eat grass, That sheep eat snow surprises John.

Each operator in a sentence makes a contribution in information according to its likelihood of occurrence with its arguments. Highly expected combinations have low information; rare combinations have high information. The precise contribution of an operator is determined by its selection, the set of words with which it occurs with high frequency. The arguments boots, hats, sheep, grass and snow differ in meaning according to the operators for which they can appear with high likelihood in first or second argument position. For example, snow is expected as first argument of fall but not of eat, while the reverse is true of sheep. Similarly, the operators eat, devour, chew and swallow differ in meaning to the extent that the arguments they select and the operators that select them differ.

Operator Grammar predicts that the information carried by a sentence is the accumulation of contributions of each argument and operator. The increment of information that a given word adds to a new sentences is determined by how it was used before. In turn, new usages stretch or even alter the information content associated with a word. Because this process is based on high frequency usage, the meanings of words are relatively stable over time, but can change in accordance with the needs of a linguistic community.

[edit] Bibliography

  • Harris, Zellig (1982), A Grammar of English on Mathematical Principles, New York: John Wiley and Sons, ISBN 0471029580
  • Harris, Zellig (1988), Language and Information, New York: Columbia University Press, ISBN 0-231-06662-7
  • Harris, Zellig (1989), The Form of Information in Science: Analysis of an immunology sublanguage, Springer, ISBN 90-277-2516-0
  • Harris, Zellig (1991), A Theory of Language and Information: A Mathematical Approach, Oxford University Press, USA, ISBN 0-19-824224-7

Read Full Post »

Recursive categorical syntax

From Wikipedia, the free encyclopedia

  (Redirected from Algebraic syntax)
Jump to: navigation, search

This article is in need of attention from an expert on the subject. WikiProject Linguistics or the Linguistics Portal may be able to help recruit one. (November 2008)

Recursive categorical syntax, also sometimes called algebraic syntax, is an algebraic theory of syntax developed by Michael Brame as an alternative to transformational-generative grammar. It is a type of dependency grammar, and is related to link grammars.

[edit] Definition

Brame formulated an algebra, (technically a non-associative groupoid with inverses) of lexical items (words and phrases), or lexes for short. A lex is a string representation of a word or phrase together with a string of directed types. A directed type is a symbol representing a syntactic type together with a direction (up, down, left, right) usually given by an arrow beside or above the symbol. In this article left and down arrows will be placed to the left and right and up arrows to the right of symbols.

Lexical composition of two lexes is performed by concatenating the phonetic or orthographic representations and composing the directed type strings. Thus [A, B] [C, D] = [AC, BD]. In our groupoid of directed type strings we define X→←X = X↑↓X = ←X↓X = X↑X→ = 1 for all X so that these strings “cancel.”

With these definitions we can consider the subgroupoid generated by a lexicon of primitive lexes. For example, our lexicon might contain the words [We, ←SV→], [went, ←VN→], and [home, ←N], from which we can construct [We, ←SV→] [went, ←VN→] [home, ←N] = [We went home, ←S]. Given a correct lexicon to begin with, the theory of algebraic syntax claims that the grammatical sentences will be precisely those with directed type ←S.

[edit] References

  • Brame, Michael. “Universal Word Induction vs Move α” in Linguistic Analysis, Vol. 14, No. 4, 1984.
  • Brame, Michael. “Recursive Categorical Syntax I: Semigroups, Monoids, Lattices, and Categories” in Linguistic Analysis, Vol. 14, No. 1.
  • Brame, Michael. “Recursive Categorical Syntax II: n-arity and Variable Continuation” in Linguistic Analysis, Vol. 15, No. 2-3, 1985.
  • Brame, Michael. “Recursive Categorical Syntax III: dl-Induction” in Linguistic Analysis.

Read Full Post »



From Wikipedia, the free encyclopedia

Jump to: navigation, search

Theoretical linguistics
Generative linguistics
Descriptive linguistics
Historical linguistics
Comparative linguistics
Corpus linguistics
Applied linguistics
Language acquisition
Language assessment
Language development
Language education
Linguistic anthropology
Cognitive linguistics
Computational linguistics
History of linguistics
List of linguists
Unsolved problems
v  d  e

Semantics is the study of meaning. The word “semantics” itself denotes a range of ideas, from the popular to the highly technical. It is often used in ordinary language to denote a problem of understanding that comes down to word selection or connotation. This problem of understanding has been the subject of many formal inquiries, over a long period of time. The word is derived from the Greek word σημαντικός (semantikos), “significant”,[1] from σημαίνω (semaino), “to signify, to indicate” and that from σήμα (sema), “sign, mark, token”.[2] In linguistics, it is the study of interpretation of signs or symbols as used by agents or communities within particular circumstances and contexts.[3] Within this view, sounds, facial expressions, body language, proxemics have semantic (meaningful) content, and each has several branches of study. In written language, such things as paragraph structure and punctuation have semantic content; in other forms of language, there is other semantic content.[4]

The formal study of semantics intersects with many other fields of inquiry, including proxemics, lexicology, syntax, pragmatics, etymology and others, although semantics is a well-defined field in its own right, often with synthetic properties.[5] In philosophy of language, semantics and reference are related fields. Further related fields include philology, communication, and semiotics. The formal study of semantics is therefore complex.

The word semantic in its modern sense is considered to have first appeared in French as sémantique in Michel Bréal‘s 1897 book, Essai de sémantique’. In International Scientific Vocabulary semantics is also called semasiology. The discipline of Semantics is distinct from Alfred Korzybski’s General Semantics, which is a system for looking at the semantic reactions of the whole human organism in its environment to some event, symbolic or otherwise.



[edit] Linguistics

This article may require cleanup to meet Wikipedia’s quality standards. Please improve this article if you can. (August 2008)

In linguistics, semantics is the subfield that is devoted to the study of meaning, as inherent at the levels of words, phrases, sentences, and larger units of discourse (referred to as texts). The basic area of study is the meaning of signs, and the study of relations between different linguistic units: homonymy, synonymy, antonymy, polysemy, paronyms, hypernymy, hyponymy, meronymy, metonymy, holonymy, exocentricity / endocentricity, linguistic compounds. A key concern is how meaning attaches to larger chunks of text, possibly as a result of the composition from smaller units of meaning. Traditionally, semantics has included the study of sense and denotative reference, truth conditions, argument structure, thematic roles, discourse analysis, and the linkage of all of these to syntax.

Formal semanticists are concerned with the modeling of meaning in terms of the semantics of logic. Thus the sentence John loves a bagel can be broken down into its constituents (signs), of which the unit loves may serve as both syntactic and semantic head.

In the late 1960s, Richard Montague proposed a system for defining semantic entries in the lexicon in terms of lambda calculus. In these terms, the syntactic parse of the sentence above would now indicate loves as the head, and its entry in the lexicon would point to the arguments as the agent, John, and the object, bagel, with a special role for the article “a” (which Montague called a quantifier). This resulted in the sentence being associated with the logical predicate loves (John, bagel), thus linking semantics to categorial grammar models of syntax. The logical predicate thus obtained would be elaborated further, e.g. using truth theory models, which ultimately relate meanings to a set of Tarskiian universals, which may lie outside the logic. The notion of such meaning atoms or primitives is basic to the language of thought hypothesis from the 70s.

Despite its elegance, Montague grammar was limited by the context-dependent variability in word sense, and led to several attempts at incorporating context, such as:

  • situation semantics (’80s): Truth-values are incomplete, they get assigned based on context
  • generative lexicon (’90s): categories (types) are incomplete, and get assigned based on context

[edit] The dynamic turn in semantics

In the Chomskian tradition in linguistics there was no mechanism for the learning of semantic relations, and the nativist view considered all semantic notions as inborn. Thus, even novel concepts were proposed to have been dormant in some sense. This traditional view was also unable to address many issues such as metaphor or associative meanings, and semantic change, where meanings within a linguistic community change over time, and qualia or subjective experience. Another issue not addressed by the nativist model was how perceptual cues are combined in thought, e.g. in mental rotation.[6]

This traditional view of semantics, as an innate finite meaning inherent in a lexical unit that can be composed to generate meanings for larger chunks of discourse, is now being fiercely debated in the emerging domain of cognitive linguistics[7] and also in the non-Fodorian camp in Philosophy of Language.[8] The challenge is motivated by:

  • factors internal to language, such as the problem of resolving indexical or anaphora (e.g. this x, him, last week). In these situations “context” serves as the input, but the interpreted utterance also modifies the context, so it is also the output. Thus, the interpretation is necessarily dynamic and the meaning of sentences is viewed as context change potentials instead of propositions.
  • factors external to language, i.e. language is not a set of labels stuck on things, but “a toolbox, the importance of whose elements lie in the way they function rather than their attachments to things.”[8] This view reflects the position of the later Wittgenstein and his famous game example, and is related to the positions of Quine, Davidson, and others.

A concrete example of the latter phenomenon is semantic underspecification — meanings are not complete without some elements of context. To take an example of a single word, “red”, its meaning in a phrase such as red book is similar to many other usages, and can be viewed as compositional.[9] However, the colours implied in phrases such as “red wine” (very dark), and “red hair” (coppery), or “red soil”, or “red skin” are very different. Indeed, these colours by themselves would not be called “red” by native speakers. These instances are contrastive, so “red wine” is so called only in comparison with the other kind of wine (which also is not “white” for the same reasons). This view goes back to de Saussure:

Each of a set of synonyms like redouter (‘to dread’), craindre (‘to fear’), avoir peur (‘to be afraid’) has its particular value only because they stand in contrast with one another. No word has a value that can be identified independently of what else is in its vicinity.[10]

and may go back to earlier Indian views on language, especially the Nyaya view of words as indicators and not carriers of meaning.[11]

An attempt to defend a system based on propositional meaning for semantic underspecification can be found in the Generative Lexicon model of James Pustejovsky, who extends contextual operations (based on type shifting) into the lexicon. Thus meanings are generated on the fly based on finite context.

[edit] Prototype theory

Another set of concepts related to fuzziness in semantics is based on prototypes. The work of Eleanor Rosch and George Lakoff in the 1970s led to a view that natural categories are not characterizable in terms of necessary and sufficient conditions, but are graded (fuzzy at their boundaries) and inconsistent as to the status of their constituent members.

Systems of categories are not objectively “out there” in the world but are rooted in people’s experience. These categories evolve as learned concepts of the world — meaning is not an objective truth, but a subjective construct, learned from experience, and language arises out of the “grounding of our conceptual systems in shared embodiment and bodily experience”.[12] A corollary of this is that the conceptual categories (i.e. the lexicon) will not be identical for different cultures, or indeed, for every individual in the same culture. This leads to another debate (see the Whorf-Sapir hypothesis or Eskimo words for snow).

English nouns are found by language analysis to have 25 different semantic features, each associated with its own pattern of fMRI brain activity. The individual contribution of each parameter predicts the fMRI pattern when nouns are considered thus supporting the view that nouns derive their meaning from prior experience linked to a common symbol.[13]

[edit] Computer science

In computer science, where it is considered as an application of mathematical logic, semantics reflects the meaning of programs or functions.

In this regard, semantics permits programs to be separated into their syntactical part (grammatical structure) and their semantic part (meaning). For instance, the following statements use different syntaxes (languages), but result in the same semantic:

Generally these operations would all perform an arithmetical addition of ‘y’ to ‘x’ and store the result in a variable called ‘x’.

Semantics for computer applications falls into three categories:[14]

  • Operational semantics: The meaning of a construct is specified by the computation it induces when it is executed on a machine. In particular, it is of interest how the effect of a computation is produced.
  • Denotational semantics: Meanings are modelled by mathematical objects that represent the effect of executing the constructs. Thus only the effect is of interest, not how it is obtained.
  • Axiomatic semantics: Specific properties of the effect of executing the constructs as expressed as assertions. Thus there may be aspects of the executions that are ignored.

The Semantic Web refers to the extension of the World Wide Web through the embedding of additional semantic metadata; s.a. Web Ontology Language (OWL).

[edit] Psychology

In psychology, semantic memory is memory for meaning, in other words, the aspect of memory that preserves only the gist, the general significance, of remembered experience, while episodic memory is memory for the ephemeral details, the individual features, or the unique particulars of experience. Word meaning is measured by the company they keep; the relationships among words themselves in a semantic network. In a network created by people analyzing their understanding of the word (such as Wordnet) the links and decomposition structures of the network are few in number and kind; and include “part of”, “kind of”, and similar links. In automated ontologies the links are computed vectors without explicit meaning. Various automated technologies are being developed to compute the meaning of words: latent semantic indexing and support vector machines as well as natural language processing, neural networks and predicate calculus techniques.

[edit] References

  1. ^ “Semantikos, Henry George Liddell, Robert Scott, A Greek-English Lexicon, at Perseus”. http://www.perseus.tufts.edu/cgi-bin/ptext?doc=Perseus%3Atext%3A1999.04.0057%3Aentry%3D%2393797. 
  2. ^ “Semaino, Henry George Liddell, Robert Scott, An Intermediate Greek-English Lexicon, at Perseus”. http://www.perseus.tufts.edu/cgi-bin/ptext?doc=Perseus%3Atext%3A1999.04.0058%3Aentry%3D%2329446. 
  3. ^ name=Neurath:1955
  4. ^ Otto Neurath (Editor), Rudolf Carnap (Editor), Charles F. W. Morris (Editor) (1955). International Encyclopedia of Unified Science. Chicago, IL: University of Chicago Press. 
  5. ^ Cruise, Alan. Meaning and Language: An introduction to Semantics and Pragmatics, chapter one, Oxford Textbooks in Linguistics, 2004; Kearns, Kate. Semantics, Palgrave MacMillan 2000; Cruise, D.A. Lexical Semantics. Cambridge, 1986.
  6. ^ Barsalou, L. (1999). Perceptual Symbol Systems. Behavioral and Brain Sciences 22(4)
  7. ^ Ronald W. Langacker (1999). Grammar and Conceptualization. Berlin/New York: Mouton de Gruyer. ISBN 3110166038. 
  8. ^ a b Jaroslav Peregrin (2003). Meaning: The Dynamic Turn. Current Research in the Semantics/Pragmatics Interface. London: Elsevier. 
  9. ^ P. Gardenfors (2000). Conceptual Spaces. Cambridge, MA: MIT Press/Bradford Books. 
  10. ^ Ferdinand de Saussure (1916). The Course of General Linguistics (Cours de linguistique générale). 
  11. ^ Bimal Krishna Matilal (1990). The word and the world: India’s contribution to the study of language. Oxford.  The Nyaya and Mimamsa schools in Indian vyakarana tradition conducted a centuries-long debate on whether sentence meaning arises through composition on word meanings, which are primary; or whether word meanings are obtained through analysis of sentences where they appear. (Chapter 8).
  12. ^ George Lakoff and Mark Johnson (1999). Philosophy in the Flesh: The embodied mind and its challenge to Western thought. Chapter 1.. New York: Basic Books.. 
  13. ^ Mitchell TM, Shinkareva S, Carlson A, Chang K, Malave V, Mason R, Just M. date=200805-08 (2008). “Predicting Human Brain Activity Associated with the Meanings of Nouns”. “Science” 320: 1191–1195. doi:10.1126/science.1152876. PMID 18511683. 
  14. ^ Nielson, Hanne Riis; Nielson, Flemming (1995), Semantics with Applications, A Formal Introduction (1st ed.), Chicester, England: John Wiley & Sons, ISBN 0-471-92980-8 .

[edit] See also

[edit] Major contributors in the field of Semantics

[edit] Linguistics and semiotics

[edit] Logic and mathematics

[edit] Computer science

[edit] External links

Look up semantics in Wiktionary, the free dictionary.
Sister projectWikimedia Commons has media related to: Semantics

Read Full Post »

« Newer Posts - Older Posts »

%d bloggers like this: