MIT and IBM Develop New Tool To Help Choose the Right Method for Evaluating AI Models

Selecting the right method gives users a more accurate picture of how their model is behaving, so they are better equipped to correctly interpret its predictions.

When machine-learning models are deployed in real-world situations, perhaps to flag potential disease in X-rays for a radiologist to review, human users need to know when to trust the model’s predictions.

But machine-learning models are so large and complex that even the scientists who design them don’t understand exactly how the models make predictions. So, they create techniques known as saliency methods that seek to explain model behavior.

With new methods being released all the time, researchers from MIT and IBM Research created a tool to help users choose the best saliency method for their particular task. They developed saliency cards, which provide standardized documentation of how a method operates, including its strengths and weaknesses and explanations to help users interpret it correctly.

They hope that, armed with this information, users can deliberately select an appropriate saliency method for both the type of machine-learning model they are using and the task that model is performing, explains co-lead author Angie Boggust, a graduate student in electrical engineering and computer science at MIT and member of the Visualization Group of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).

Interviews with AI researchers and experts from other fields revealed that the cards help people quickly conduct a side-by-side comparison of different methods and pick a task-appropriate technique. Choosing the right method gives users a more accurate picture of how their model is behaving, so they are better equipped to correctly interpret its predictions.

“Saliency cards are designed to give a quick, glanceable summary of a saliency method and also break it down into the most critical, human-centric attributes. They are really designed for everyone, from machine-learning researchers to lay users who are trying to understand which method to use and choose one for the first time,” says Boggust.

Joining Boggust on the paper are co-lead author Harini Suresh, an MIT postdoc; Hendrik Strobelt, a senior research scientist at IBM Research; John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering at MIT; and senior author Arvind Satyanarayan, associate professor of computer science at MIT who leads the Visualization Group in CSAIL. The research will be presented at the ACM Conference on Fairness, Accountability, and Transparency.

Picking the right method
The researchers have previously evaluated saliency methods using the notion of faithfulness. In this context, faithfulness captures how accurately a method reflects a model’s decision-making process.

But faithfulness is not black-and-white, Boggust explains. A method might perform well under one test of faithfulness, but fail another. With so many saliency methods, and so many possible evaluations, users often settle on a method because it is popular or a colleague has used it.

However, picking the “wrong” method can have serious consequences. For instance, one saliency method, known as integrated gradients, compares the importance of features in an image to a meaningless baseline. The features with the largest importance over the baseline are most meaningful to the model’s prediction. This method typically uses all 0s as the baseline, but if applied to images, all 0s equates to the color black.

“It will tell you that any black pixels in your image aren’t important, even if they are, because they are identical to that meaningless baseline. This could be a big deal if you are looking at X-rays since black could be meaningful to clinicians,” says Boggust.
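
The baseline problem Boggust describes can be reproduced with a few lines of Python. What follows is a minimal sketch, not the researchers' code: the toy "model", its weights, and the three-pixel "image" are hypothetical, chosen so that darker pixels raise the model's score. The function approximates integrated gradients with a midpoint Riemann sum.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    # Average the model gradient along the straight path from baseline to x.
    avg_grad = np.mean(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0
    )
    return (x - baseline) * avg_grad

# Hypothetical toy model: the score rises as pixels get darker.
weights = np.array([1.0, 2.0, 3.0])

def model(x):
    return float(np.sum(weights * (1.0 - x)))

def model_grad(x):
    # Gradient of the toy model with respect to x (constant for a linear model).
    return -weights

image = np.array([0.0, 0.5, 1.0])  # pixel 0 is black, pixel 2 is white

# All-0s (black) baseline: the black pixel is identical to the baseline.
attr_black = integrated_gradients(model_grad, image, np.zeros_like(image))
# All-1s (white) baseline, an alternative choice used here for comparison.
attr_white = integrated_gradients(model_grad, image, np.ones_like(image))

print("black baseline:", attr_black)
print("white baseline:", attr_white)
```

With the all-0s baseline, the black pixel receives an attribution of zero even though this toy model responds to darkness. With the white baseline, the same pixel receives a nonzero attribution, which is why the choice of baseline matters.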

Saliency cards can help users avoid these types of problems by summarizing how a saliency method works in terms of 10 user-focused attributes. The attributes capture the way saliency is calculated, the relationship between the saliency method and the model, and how a user perceives its outputs.

For example, one attribute is hyperparameter dependence, which measures how sensitive that saliency method is to user-specified parameters. A saliency card for integrated gradients would describe its parameters and how they affect its performance. With the card, a user could quickly see that the default parameters — a baseline of all 0s — might generate misleading results when evaluating X-rays.
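
One way to picture a saliency card is as a small structured record that can be lined up against others. The sketch below is an assumption made for illustration, not the format the researchers published: the `SaliencyCard` class, the `compare` helper, and the placeholder second method are all hypothetical, and only the hyperparameter-dependence note is drawn from the article.

```python
from dataclasses import dataclass, field

@dataclass
class SaliencyCard:
    """Hypothetical record: one saliency method and notes on its attributes."""
    method: str
    # Maps an attribute name to a short, human-readable note.
    attributes: dict = field(default_factory=dict)

def compare(cards, attribute):
    """Return each method's note for one attribute, side by side."""
    return {
        card.method: card.attributes.get(attribute, "not documented")
        for card in cards
    }

ig_card = SaliencyCard(
    method="integrated gradients",
    attributes={
        "hyperparameter dependence": (
            "depends on the baseline; the all-0s default is black in images"
        ),
    },
)
# Placeholder entry with no documented attributes, for illustration only.
other_card = SaliencyCard(method="some other method")

for method, note in compare(
    [ig_card, other_card], "hyperparameter dependence"
).items():
    print(f"{method}: {note}")
```

Keeping every attribute under a fixed name is what makes the side-by-side comparison possible, and a missing entry shows up as a gap rather than being silently skipped.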

The cards could also be useful for scientists by exposing gaps in the research space. For instance, the MIT researchers were unable to identify a saliency method that was computationally efficient, but could also be applied to any machine-learning model.

“Can we fill that gap? Is there a saliency method that can do both things? Or maybe these two ideas are theoretically in conflict with one another,” Boggust says.

Showing their cards
Once they had created several cards, the team conducted a user study with eight domain experts, from computer scientists to a radiologist who was unfamiliar with machine learning. During interviews, all participants said the concise descriptions helped them prioritize attributes and compare methods. And even though he was unfamiliar with machine learning, the radiologist was able to understand the cards and use them to take part in the process of choosing a saliency method, Boggust says.

The interviews also revealed a few surprises. Researchers often expect that clinicians want a method that is sharp, meaning it focuses on a particular object in a medical image. But the clinician in this study actually preferred some noise in medical images to help them attenuate uncertainty.

“As we broke it down into these different attributes and asked people, not a single person had the same priorities as anyone else in the study, even when they were in the same role,” she says.

Moving forward, the researchers want to explore some of the more under-evaluated attributes and perhaps design task-specific saliency methods. They also want to develop a better understanding of how people perceive saliency method outputs, which could lead to better visualizations. In addition, they are hosting their work on a public repository so others can provide feedback that will drive future work, Boggust says.

“We are really hopeful that these will be living documents that grow as new saliency methods and evaluations are developed. In the end, this is really just the start of a larger conversation around what the attributes of a saliency method are and how those play into different tasks,” she says.

Reference: “Saliency Cards: A Framework to Characterize and Compare Saliency Methods” by Angie Boggust, Harini Suresh, Hendrik Strobelt, John Guttag and Arvind Satyanarayan.

The research was supported, in part, by the MIT-IBM Watson AI Lab, the U.S. Air Force Research Laboratory, and the U.S. Air Force Artificial Intelligence Accelerator.

Zap Energy Unveils Innovative Method to Quantify Fusion Energy Gain

Zap Energy, in a paper published in Fusion Science and Technology, has defined its methodology for measuring and calculating the net energy gain, or Q, in sheared-flow-stabilized Z-pinch fusion plasmas. This marks a significant step towards demonstrating energy gain in fusion energy development. Credit: Zap Energy

A new paper lays out scientific methods for measuring and calculating Q in a sheared-flow-stabilized Z pinch.

Zap Energy has outlined its approach to measuring net energy gain, known as Q, in fusion energy development, according to a newly published study. The company’s Z-pinch fusion plasmas differ significantly from those of other fusion technologies: they are about 100,000 times denser than the plasmas in tokamaks and last for many microseconds.

In the race to develop fusion energy, each unique approach requires its own specialized techniques to determine net energy gain, an equation balancing energy in and out that’s known by the letter Q.

A new paper, published today (June 5) in the journal Fusion Science and Technology, establishes the company’s method of measuring and calculating Q in Zap’s sheared-flow-stabilized Z-pinch fusion plasmas. The publication will be an important part of Zap demonstrating energy gain on the way to building a commercial fusion system.

“The way we generate fusion-grade plasmas in our devices is different from other fusion technologies so this paper helps lay the groundwork for quantifying our progress,” says Uri Shumlak, Zap Energy cofounder, Chief Science Officer and lead author on the paper.

Hot enough, dense enough, for long enough — the three variables of temperature, density, and time are collectively known in fusion as the triple product. And while there are different ways to create fusion, all must scale up triple product to achieve net energy gains. Credit: Zap Energy

A distinctive approach
Like other fusion devices, Zap Energy plans to fuse hydrogen nuclei within material called plasma that must be superheated to temperatures hotter than the sun. The plasma properties can be measured to determine Q, or net energy gain, partly by calculating their triple product: how hot and how dense a plasma is, and how long it lasts.

Triple product is useful for comparing different fusion concepts, for example sheared-flow-stabilized Z-pinch devices with more traditional fusion devices such as the tokamak. It can also be used as a simplified proxy for Q.
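
As a rough illustration of why triple product allows comparison across concepts, the Python sketch below multiplies density, temperature, and confinement time for two plasmas. The numbers are illustrative placeholders, not measurements from Zap Energy or any tokamak, and the function name and units are assumptions made for this example.

```python
import math

def triple_product(density_per_m3, temperature_kev, confinement_time_s):
    """Density x temperature x confinement time, in keV * s / m^3."""
    return density_per_m3 * temperature_kev * confinement_time_s

# Placeholder values: a dilute plasma held for a full second...
dilute_long = triple_product(1e20, 10.0, 1.0)
# ...and a plasma 100,000 times denser that lasts only 10 microseconds.
dense_short = triple_product(1e25, 10.0, 10e-6)

print(f"dilute, long-lived: {dilute_long:.3e}")
print(f"dense, short-lived: {dense_short:.3e}")
print("comparable:", math.isclose(dilute_long, dense_short))
```

Under these assumed values the two plasmas reach the same triple product by very different routes, one through confinement time and the other through density.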

Zap Energy creates fusion in a filament of plasma less than two feet long. The inset image is a high-speed camera photo of a plasma in Zap’s device. Credit: Zap Energy

In Zap’s case, its distinctive Z-pinch plasmas are about 100,000 times more dense than those in tokamaks and last for many microseconds. A pulsed system is being designed to create plasmas repeatedly.

Zap’s plasmas flow along a line, with material near the axis of the line moving at a different speed from material at its outer edges. This creates what’s called sheared-flow stabilization, which maintains the plasma long enough for sustained fusion reactions to occur. Sheared-flow stabilization allows Zap to confine plasmas without external magnets, but also leads to the need for uniquely suited measurements and analysis.

Measuring Q
To calculate triple product, Zap measures the temperature of the plasma, its density, and the flow velocity to determine the duration of plasma confinement. The corresponding calculation of Q is the ratio of fusion power (output) to input power and compares closely to the method used to measure gain in other magnetic confinement approaches, such as the tokamak. Inertial confinement approaches, like last year’s demonstration of Q>1 by Lawrence Livermore National Laboratory’s National Ignition Facility, produce short-lived plasmas and define Q as the ratio of fusion energy to input energy.

The main difference between power and energy is that power is the energy per unit of time. Since Zap’s plasmas are confined for timeframes that sit between traditional magnetic and inertial fusion approaches, choosing to calculate Q based on power is an important distinction.
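
The power-versus-energy distinction can be sketched in a few lines of Python. The figures below are made-up placeholders, not Zap Energy results, and the function names are hypothetical. The example assumes constant power over each interval, so that energy is simply power multiplied by duration.

```python
def q_from_power(fusion_power_w, input_power_w):
    """Gain defined as a ratio of powers."""
    return fusion_power_w / input_power_w

def q_from_energy(fusion_energy_j, input_energy_j):
    """Gain defined as a ratio of energies."""
    return fusion_energy_j / input_energy_j

# Placeholder numbers chosen only to show how the two definitions can differ.
fusion_power_w = 2.0e6     # fusion output while the plasma is burning
input_power_w = 1.0e6      # power supplied to the device
fusion_duration_s = 10e-6  # how long fusion output lasts
input_duration_s = 20e-6   # how long input power is applied

q_power = q_from_power(fusion_power_w, input_power_w)
q_energy = q_from_energy(
    fusion_power_w * fusion_duration_s,  # energy = power x time
    input_power_w * input_duration_s,
)

print(f"Q from power:  {q_power:.2f}")
print(f"Q from energy: {q_energy:.2f}")
```

When the output and input last for different lengths of time, the two ratios diverge, which is why stating which definition is in use matters for a pulsed device.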

“Publishing these technical details is very important. You can’t just drop a thermometer into a fusion plasma to see what’s happening, so instead we use a combination of direct and indirect observations that help give a picture of the conditions,” says Ben Levitt, Zap Energy Vice President of R&D. “This paper gives us a chance to make sure that other physicists agree our methodology conforms well with what’s been established over the years in the fusion community and lays out the way we plan on reporting our results in the near future.”

Z-pinch nuances
The paper includes a number of details that are specific to Zap’s fusion approach. One of the most important is accounting for the input power needed to drive the stabilizing plasma flow.

The paper also notes that for high-performance pinches, it’s likely that alpha particles, an energetic product of the fusion reactions, will be trapped and boost fusion gain by offsetting some of the required input power.

Zap plans to correlate observations of plasma conditions with measurements of neutrons being emitted. Because neutrons are a primary product of fusion reactions, scientists would expect them to increase when fusion conditions are right and decrease when they’re not.

Zap achieved the first plasmas on its fourth-generation device, FuZE-Q, last May. R&D campaigns are now underway using FuZE-Q. The Zap team will analyze results from both FuZE-Q and its predecessor FuZE as they push toward demonstrating the first sheared-flow-stabilized Z-pinch plasmas capable of Q>1.

Reference: “Fusion Gain and Triple Product for the Sheared-Flow-Stabilized Z Pinch” 5 June 2023, Fusion Science and Technology.
DOI: 10.1080/15361055.2023.2198049

Zap Energy is building a low-cost, compact, and scalable fusion energy platform that confines and compresses plasma without the need for expensive and complex magnetic coils. Zap’s sheared-flow-stabilized Z-pinch technology provides compelling fusion economics and requires orders of magnitude less capital than conventional approaches. Zap Energy has over one hundred team members in two facilities near Seattle and is backed by leading financial and strategic investors.

Behind Galactic Bars: Webb Telescope Unlocks Secrets of Star Formation

NASA’s James Webb Space Telescope has captured a detailed image of the barred spiral galaxy NGC 5068. Part of a project to record star formation in nearby galaxies, this initiative provides significant insights into various astronomical fields. The telescope’s ability to see through gas and dust, typically hiding star formation processes, offers unique views into this crucial aspect of galactic evolution.

A delicate tracery of dust and bright star clusters threads across this image from the James Webb Space Telescope. The bright tendrils of gas and stars belong to the barred spiral galaxy NGC 5068, whose bright central bar is visible in the upper left of this image – a composite from two of Webb’s instruments. NASA Administrator Bill Nelson revealed the image on June 2 during an event with students at the Copernicus Science Centre in Warsaw, Poland.

In this image of the barred spiral galaxy NGC 5068, from the James Webb Space Telescope’s MIRI instrument, the dusty structure of the spiral galaxy and glowing bubbles of gas containing newly-formed star clusters are particularly prominent. Three asteroid trails intrude into this image, represented as tiny blue-green-red dots. Asteroids appear in astronomical images such as these because they are much closer to the telescope than the distant target. As Webb captures several images of the astronomical object, the asteroid moves, so it shows up in a slightly different place in each frame. They are a little more noticeable in images such as this one from MIRI, because many stars are not as bright in mid-infrared wavelengths as they are in near-infrared or visible light, so asteroids are easier to see next to the stars. One trail lies just below the galaxy’s bar, and two more in the bottom-left corner. Credit: ESA/Webb, NASA & CSA, J. Lee and the PHANGS-JWST Team

NGC 5068 lies around 20 million light-years from Earth in the constellation Virgo. This image of the central, bright star-forming regions of the galaxy is part of a campaign to create an astronomical treasure trove, a repository of observations of star formation in nearby galaxies. Previous gems from this collection include images of IC 5332 and M74. These observations are particularly valuable to astronomers for two reasons. The first is that star formation underpins so many fields in astronomy, from the physics of the tenuous plasma that lies between stars to the evolution of entire galaxies. By observing the formation of stars in nearby galaxies, astronomers hope to kick-start major scientific advances with some of the first available data from Webb.

This view of the barred spiral galaxy NGC 5068, from the James Webb Space Telescope’s NIRCam instrument, is studded by the galaxy’s massive population of stars, most dense along its bright central bar, along with burning red clouds of gas illuminated by young stars within. This near-infrared image of the galaxy is filled by the enormous gathering of older stars which make up the core of NGC 5068. The keen vision of NIRCam allows astronomers to peer through the galaxy’s gas and dust to closely examine its stars. Dense and bright clouds of dust lie along the path of the spiral arms: These are H II regions, collections of hydrogen gas where new stars are forming. The young, energetic stars ionize the hydrogen around them, creating this glow represented in red. Credit: ESA/Webb, NASA & CSA, J. Lee and the PHANGS-JWST Team

The second reason is that Webb’s observations build on other studies using telescopes including the Hubble Space Telescope and ground-based observatories. Webb collected images of 19 nearby star-forming galaxies which astronomers could then combine with Hubble images of 10,000 star clusters, spectroscopic mapping of 20,000 star-forming emission nebulae from the Very Large Telescope (VLT), and observations of 12,000 dark, dense molecular clouds identified by the Atacama Large Millimeter/submillimeter Array (ALMA). These observations span the electromagnetic spectrum and give astronomers an unprecedented opportunity to piece together the minutiae of star formation.

With its ability to peer through the gas and dust enshrouding newborn stars, Webb is particularly well-suited to explore the processes governing star formation. Stars and planetary systems are born amongst swirling clouds of gas and dust that are opaque to visible-light observatories like Hubble or the VLT. The keen vision at infrared wavelengths of two of Webb’s instruments — MIRI (Mid-Infrared Instrument) and NIRCam (Near-Infrared Camera) — allowed astronomers to see right through the gargantuan clouds of dust in NGC 5068 and capture the processes of star formation as they happened. This image combines the capabilities of these two instruments, providing a truly unique look at the composition of NGC 5068.

The James Webb Space Telescope stands as the apex of space science observatories worldwide. Tasked with demystifying enigmas within our own solar system, Webb will also extend its gaze beyond, seeking to observe distant worlds orbiting other stars. In addition to this, it aims to delve into the cryptic structures and the origins of our universe, thereby facilitating a deeper understanding of our position within the cosmic expanse. The Webb project is an international endeavor spearheaded by NASA, conducted in close partnership with the European Space Agency (ESA) and the Canadian Space Agency.

Facebook Messenger now lets you play multiplayer games during video calls

Facebook just announced it is implementing multiplayer games into the video call feature within Messenger. This functionality allows you to converse with friends and family as you kick their booty in 14 currently-available titles. Trash talk is back, baby!

The video call gaming feature is available on Messenger for iOS, Android and the web, with no specialized installations required. The 14 games being showcased at launch range from old favorites like Words With Friends and Mini Golf FRVR to newer titles like Card Wars and Exploding Kittens. Each game is designed to be played by as few as two people, though each title boasts differing maximum player numbers.

Each game is optimized for the service, with clearly-demarcated leader boards, and a user interface that leverages the Messenger experience. All you have to do is start a video call on Messenger, tap the group mode button, tap the “Play” icon, and then browse through the library of available games. The company has been experimenting with Messenger-enabled games for the past few years, but nothing has really stuck, so one hopes this new mode has some staying power.

The launch lineup here is relatively slim, at 14 titles, but Facebook Gaming says more free games are on the horizon later this year. To that end, the company is urging interested developers to contact their Partner Manager for details on how to add games to the platform. This news comes mere months after Meta shuttered the standalone Facebook Gaming app.

Best Local Multiplayer Games on PS4

What are the best local multiplayer games on PS4? The truth is, there are so many great options for some couch multiplayer goodness on PS4. While many titles offer great online experiences, nothing quite beats local multiplayer together with friends and family in the same room. Finding the right game can quickly turn four best mates into mortal enemies, or transform the group into a hysterical mess of laughter, screams, and joy.

Luckily, the PS4 has a plethora of such titles to play. From stone-cold classics like Rayman Legends and Shovel Knight to utterly unique hidden gems such as Cuphead, there’s a wide range of titles to play with others that’ll get everyone leaping out of their seats with excitement.

As with all these lists, though, it’s not up to us. All the games featured here are chosen and put in order by you lot. Your user ratings directly impact this page, making it an evolving reflection of your favourites. All you need to do is use the search bar below to find some PS4 local multiplayer games, then click the star and rate them accordingly.

30. Resident Evil 5 (PS4)
User Rating: 7.27 / Review: 6/10
Publisher: Capcom / Developer: Capcom
Release Date: 28th Jun 2016 (USA) / 28th Jun 2016 (UK/EU)

Resident Evil 5 was divisive when it released and continues to split opinion today, but co-op shooters don’t really get much more satisfying than Chris Redfield and Sheva Alomar’s shuffle through West Africa. While the game lacked the atmosphere of its immediate predecessor, its PS4 port runs at a silky smooth framerate in 1080p, and bundles in all previously released DLC, making for a lengthy and undeniably entertaining entry in the series.

29. Street Fighter V (PS4)
User Rating: 7.31 / Review: 9/10
Publisher: Capcom / Developer: Capcom
Release Date: 16th Feb 2016 (USA) / 16th Feb 2016 (UK/EU)

Street Fighter V has improved dramatically over the years, culminating in the unmissable Street Fighter V: Arcade Edition, and few games deliver multiplayer thrills and spills quite like it. This game has benefitted from years of updates and improvements, and has blossomed into a must-play entry in the genre – despite its day one woes. With a robust spread of single player content and one of the strongest competitive scenes on PS4, this is a total knockout.

28. Knack 2 (PS4)
User Rating: 7.4 / Review: 8/10
Publisher: Sony Computer Entertainment / Developer: Japan Studio
Release Date: 5th Sep 2017 (USA) / 5th Sep 2017 (UK/EU)

Knack gets a lot of grief, and has been reduced to little more than memes at this point, but the sequel is a genuinely good action platformer. While the first game undoubtedly has its issues, Knack 2 is a big improvement. Expanded combat options, a bigger emphasis on platforming, and local co-op make this adventure a very enjoyable one that’s suitable for the whole family.

27. Dragon Ball XenoVerse 2 (PS4)
User Rating: 7.44 / Review: 8/10
Publisher: Bandai Namco / Developer: Dimps
Release Date: 25th Oct 2016 (USA) / 28th Oct 2016 (UK/EU)
Available On: PS+ Extra

A mission-based action RPG aimed squarely at Dragon Ball fans, Dragon Ball XenoVerse 2 tasks you with correcting franchise history. Playing as your own custom hero, you travel to various points in the Dragon Ball timeline, helping Goku and the gang out in their iconic battles. Highly customisable combat and a near endless list of missions make XenoVerse 2 an adaptation that’s hard to put down once you’re in the grind.

26. Injustice: Gods Among Us Ultimate Edition (PS4)
User Rating: 7.45 / Review: 8/10
Publisher: Warner Bros / Developer: NetherRealm Studios
Release Date: 12th Nov 2013 (USA) / 29th Nov 2013 (UK/EU)

A souped-up re-release which, bizarrely, Sony bought as a console exclusive to complement the PS4’s launch lineup. Injustice: Gods Among Us Ultimate Edition was, effectively, a port of the PS3’s Injustice: Gods Among Us – bundling in all of its add-on packs and extras. With a huge cast, spanning DC Comics icons like Batman and Wonder Woman through to more obscure, lesser-known names, this licensed fighter proved a worthy content-packed alternative to the traditional stalwarts of the genre, like Street Fighter.

25. Child of Light (PS4)
User Rating: 7.48 / Review: 9/10
Publisher: Ubisoft / Developer: Ubisoft
Release Date: 29th Apr 2014 (USA) / 30th Apr 2014 (UK/EU)
Available On: PS+ Extra

Billed as a playable poem, Child of Light is a gorgeous RPG that combines a painterly art style with rhyming narration and an engrossing story. Still unique today, you can play this lovely game alone or with a buddy. Either way, there’s so much to enjoy here, with enjoyable combat and so many wonderful sights and sounds along the journey.

24. Borderlands: The Handsome Collection (PS4)
User Rating: 7.57 / Review: 8/10
Publisher: 2K Games / Developer: Gearbox Software
Release Date: 24th Mar 2015 (USA) / 27th Mar 2015 (UK/EU)
Available On: PS+ Premium

If you’re after some co-op FPS action, the Borderlands games offer some top notch shooting and looting. Borderlands: The Handsome Collection is a great deal, giving you access to Borderlands 2 and Borderlands: The Pre-Sequel, along with all the accompanying DLC. With a ludicrous number of possible weapons, some great sci-fi environments to explore, and some daft storytelling to pull you along, this series is a highly entertaining way to get your co-op FPS kicks.

23. Resident Evil: Revelations 2 (PS4)
User Rating: 7.58 / Review: 8/10
Publisher: Capcom / Developer: Capcom
Release Date: 18th Mar 2015 (USA) / 20th Mar 2015 (UK/EU)

Originally released episodically as a direct sequel to Nintendo 3DS game Resident Evil: Revelations, it’s probably fair to say that Resident Evil: Revelations 2 has largely been forgotten. But the game, starring Claire Redfield and Barry Burton’s daughter Moira, has some high points, including full co-op support. When playing solo, you’ll need to toggle between different characters to solve puzzles. There’s also a Raid Mode which features a bunch of combat gauntlets, all of which can be enjoyed with a friend locally and online.

22. Overcooked (PS4)
User Rating: 7.58 / Review: 8/10
Publisher: Team17 / Developer: Ghost Town Games
Release Date: 2nd Aug 2016 (USA) / 2nd Aug 2016 (UK/EU)

Despite later versions improving upon the formula, the original Overcooked is still a wonderful multiplayer game. It kicked off a trend of simple, accessible co-op games, and it’s not hard to understand why. This cooking game may look friendly and straightforward, but if you and your friends aren’t careful, chaos can quickly take over the kitchen. Communication is key in this fantastic co-op game, and if you want a bit of spice, there are some competitive multiplayer levels too.

21. Crash Bandicoot 4: It’s About Time (PS4)
User Rating: 7.61 / Review: 8/10
Publisher: Activision / Developer: Toys for Bob
Release Date: 2nd Oct 2020 (USA) / 2nd Oct 2020 (UK/EU)

Crash Bandicoot 4: It’s About Time gets the marsupial back on track. This is a stylish, highly polished 3D platformer that harkens back to the gameplay of the original games while feeling fresh and modernising the controls. Throw in a lovely art style, multiple playable characters, and oodles of optional side content, and you have a super robust game that players of all ages and skill levels can enjoy.