Chance aggregate tensor semantics: a made-up Dewey Decimal alternative with no total order and no hierarchy, just a set of scalar dimensions.
Dewey Decimal Classification (DDC) assigns each book a decimal number whose digits work like a “table of contents” of sections and subsections. Because these numbers can be ordered, books can be arranged on a shelf and visitors can narrow their search to a section of content.
- 100 (Philosophy & Psychology)
- 101 (Theory of philosophy)
- 110 (Metaphysics)
- 114 (Space)
A book has only one canonical DDC number.
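As a rough illustration of the ordering and narrowing described above, the four numbers from the list can be sorted as plain strings and filtered by a leading-digit prefix. This is only a sketch: the `narrow` helper is a hypothetical name, and real DDC numbers can carry decimal extensions that this toy example ignores.

```python
# DDC numbers from the list above, paired with their captions.
ddc = {
    "114": "Space",
    "100": "Philosophy & Psychology",
    "110": "Metaphysics",
    "101": "Theory of philosophy",
}

# Every number here is a three-digit string, so plain string sorting
# gives the shelf order.
shelf_order = sorted(ddc)


def narrow(prefix: str) -> list[str]:
    """Return the numbers whose leading digits match `prefix` (hypothetical helper)."""
    return [number for number in shelf_order if number.startswith(prefix)]


print(shelf_order)   # ['100', '101', '110', '114']
print(narrow("11"))  # ['110', '114']
```

The single sort key is what gives DDC its total order, and the shared prefix is what gives it its hierarchy.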
Chance Aggregate Tensor
I introduce Chance aggregate tensor semantics (CATS), which has no notation of its own, although tensor notation may be used.
A tensor is a multilinear object, which here means it has many linear dimensions.
It can contain an infinite number of dimensions, so no content will be perfectly classified, nor should it be expected to be.
With infinite tensor dimensions, the goal is not to fully categorize information, but to find keystone dimensions that are valuable in finding and organizing information.
Dimensions should, over time, become rigorously documented so that the general public can put content precisely where it belongs with little moderation.
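One way to picture "a set of scalar dimensions" with no total order is a sparse mapping from dimension name to value. The sketch below is illustrative only: the dimension names (`reading_level`, `historical_period`), the values, and the `sort_along` helper are all assumptions, not part of CATS.

```python
# Hypothetical content items. Each one is placed only on the dimensions
# someone has actually classified it along.
items = {
    "book_a": {"reading_level": 0.3, "historical_period": 1850.0},
    "book_b": {"reading_level": 0.8, "historical_period": 1620.0},
    "essay": {"reading_level": 0.5},  # not classified on historical_period
}


def sort_along(items: dict[str, dict[str, float]], dimension: str) -> list[str]:
    """Order the items that have a value on `dimension`; skip the rest."""
    placed = [name for name, dims in items.items() if dimension in dims]
    return sorted(placed, key=lambda name: items[name][dimension])


# The order depends entirely on which dimension is chosen.
print(sort_along(items, "reading_level"))      # ['book_a', 'essay', 'book_b']
print(sort_along(items, "historical_period"))  # ['book_b', 'book_a']
```

No single ordering of the three items is privileged. Each dimension induces its own order, and an item that is unclassified on a dimension simply does not appear there.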
Curators & Librarians
A curator is an independent person who identifies and categorizes information. A librarian is a curator who identifies and categorizes information for a library.
A library is a place to search through categorized information. A library’s role is to make it easy for humans and machines to find the information they’re looking for, to admit the information it approves of, and to contribute to the categorization of the information it admits.
A library may have its own rules on what information is allowed and which categorizations it applies to that information. However, it must always make a best effort to:
- Identify content in a consistent way with other libraries
- Use known and documented dimensions wherever possible
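The text does not say how content should be identified consistently across libraries. One possible approach, shown here purely as an assumption, is to derive the identifier from the content's bytes, so that two libraries holding the same bytes arrive at the same identifier independently. The `content_id` function and the `sha256:` prefix are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself (assumed scheme, not part of CATS)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Two libraries hashing the same bytes get the same identifier.
library_one = content_id(b"On the nature of space")
library_two = content_id(b"On the nature of space")
print(library_one == library_two)  # True
```

A byte-level hash treats any change, even a reformatting, as different content, so a real scheme would need to decide what counts as "the same" content.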
Human or machine visitors may subscribe to content. They may be anonymous or may have preferences that should be considered by libraries or curators.
A consumer should be able to find information by selecting categories or ranges from each relevant dimension to filter the results.
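The filtering described above can be sketched as a set of selectors, one per relevant dimension, where each selector is either a set of accepted categories or an inclusive range. The catalog, the dimension names, and the `matches` and `search` helpers are hypothetical and only meant to show the shape of the idea.

```python
from typing import Union

# A selector is either an inclusive (low, high) range for a numeric
# dimension or a set of accepted categories for a categorical one.
Selector = Union[tuple[float, float], set[str]]

# Hypothetical catalog; dimension names and values are illustrative only.
catalog = [
    {"id": "a", "dims": {"language": "en", "reading_level": 0.3}},
    {"id": "b", "dims": {"language": "fr", "reading_level": 0.8}},
    {"id": "c", "dims": {"language": "en", "reading_level": 0.9}},
    {"id": "d", "dims": {"language": "en"}},  # no reading_level recorded
]


def matches(dims: dict, selectors: dict[str, Selector]) -> bool:
    """Return True if the content satisfies every selector."""
    for name, selector in selectors.items():
        if name not in dims:
            return False  # unclassified on a selected dimension
        value = dims[name]
        if isinstance(selector, tuple):
            low, high = selector
            if not low <= value <= high:
                return False
        elif value not in selector:
            return False
    return True


def search(selectors: dict[str, Selector]) -> list[str]:
    """Return the ids of catalog items that pass all selectors."""
    return [item["id"] for item in catalog if matches(item["dims"], selectors)]


print(search({"language": {"en"}}))                               # ['a', 'c', 'd']
print(search({"language": {"en"}, "reading_level": (0.5, 1.0)}))  # ['c']
```

In this sketch, content that has no value on a selected dimension is excluded. A library could just as reasonably choose to include it, since no content is expected to be classified on every dimension.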
An example of navigating high-dimensional data can be seen at facebookresearch.github.io/hiplot