
Data is the new oil, as they say, and perhaps that makes Harvard University the new Exxon. The school announced Thursday the launch of a dataset containing nearly one million public domain books that can be used for training AI models. Under the newly formed Institutional Data Initiative, the project has received funding from both Microsoft and OpenAI, and contains books scanned by Google Books that are old enough that their copyright protection has expired.

Wired, in a piece on the new project, says the dataset includes a wide variety of books, with “classics from Shakespeare, Charles Dickens, and Dante included alongside obscure Czech math textbooks and Welsh pocket dictionaries.” As a general rule, copyright protections last for the lifetime of the author plus an additional 70 years.

Foundational language models like ChatGPT, which are designed to sound convincingly human, require an immense amount of high-quality text for their training. Generally, the more information they ingest, the better the models perform at imitating humans and serving up knowledge. But that thirst for data has caused problems as the likes of OpenAI have hit walls on how much new information they can find, at least without stealing it.

We will examine physicist Erwin Schrödinger’s view that consciousness is one unified entity shared by all beings and its implications for spirituality.

00:00:00 A Quantum Pioneer Contemplates Consciousness.

00:02:54 Schrödinger’s Philosophical Pursuits.

00:06:50

Researchers at the University of Sydney Nano Institute have made a significant advance in the field of molecular robotics by developing custom-designed and programmable nanostructures using DNA origami.

This innovative approach has potential across a range of applications, from targeted drug delivery systems to responsive materials and energy-efficient optical signal processing. The method is called ‘DNA origami’ because it uses the natural folding power of DNA, the building block of human life, to create new and useful biological structures.

As a proof-of-concept, the researchers made more than 50 nanoscale objects, including a ‘nano-dinosaur’, a ‘dancing robot’ and a mini-Australia that is 150 nanometres wide, a thousand times narrower than a human hair.

Husker researchers Seunghee Kim, Karrie Weber and Hyun-Seob Song are studying the Midcontinent Rift — which runs from beneath Lake Superior through parts of Minnesota, Michigan, Wisconsin, Iowa, Nebraska and Kansas — to determine how best to access a potential store of natural hydrogen that could yield vast amounts of clean energy.

Originally published on Towards AI.

In the evolving landscape of artificial intelligence, data remains the fuel that powers innovation. But what happens when acquiring real-world data becomes challenging, expensive, or even impossible?

Enter synthetic data generation — a groundbreaking technique that leverages language models to create high-quality, realistic datasets. Consider training a language model on medical records without breaching privacy laws, or developing a customer interaction model without access to private conversation logs, or designing autonomous driving systems where collecting data on rare edge cases is nearly impossible. Synthetic data bridges gaps in data availability while maintaining the realism needed for effective AI training.

Scientists have made a satisfying and intriguing physics discovery some 16 years after it was first predicted to be a possibility: a quasiparticle (a group of particles behaving as one) that only has an effective mass when moving in one direction.

In physics, mass generally refers to a property of particles that relates to things like their energy and resistance to movement. Yet not all mass is built the same – some describes the energy of a particle at rest, for example, while mass may also take into account the energy of a particle’s motion.

In this case, the effective mass describes the quasiparticle’s response to forces, which varies depending on whether the movement through the material is up and down, or back and forth.

“So when we’re talking about mirror-image life, it’s kind of like a ‘what if’ experiment: What if we constructed life with right-handed proteins instead of left-handed proteins? Something that would be very, very similar to natural life, but doesn’t exist in nature. We call this mirror-image life or mirror life,” explained Michael Kay, a professor of biochemistry at the University of Utah’s medical school.

Some scientists like Kay are interested in the medical possibilities of mirror-image therapeutics, which Kay says hold potential for treating chronic illness in a more cost-effective way, but both he and the authors of the recently published commentary are concerned about the potential threats posed by mirror bacteria.

“Our analysis suggests that mirror bacteria could broadly evade many immune defenses of humans, animals, and plants. Chiral interactions, which are central to immune recognition and activation in multicellular organisms, would be impaired with mirror bacteria,” according to the scientists.