import textwrap

content = """Mozilla's "Trustworthy AI" Thinking Points:

PRIVACY: How is data collected, stored, and shared? Our personal data powers everything from traffic maps to targeted advertising. Trustworthy AI should enable people to decide how their data is used and what decisions are made with it.

FAIRNESS: We’ve seen time and again how bias shows up in computational models, data, and frameworks behind automated decision making. The values and goals of a system should be power aware and seek to minimize harm. Further, AI systems that depend on human workers should protect people from exploitation and overwork.

TRUST: People should have agency and control over their data and algorithmic outputs, especially considering the high stakes for individuals and societies. For instance, when online recommendation systems push people towards extreme, misleading content, potentially misinforming or radicalizing them.

SAFETY: AI systems can carry high risk for exploitation by bad actors. Developers need to implement strong measures to protect our data and personal security. Further, excessive energy consumption and extraction of natural resources for computing and machine learning accelerates the climate crisis.

TRANSPARENCY: Automated decisions can have huge personal impacts, yet the reasons for decisions are often opaque. We need to mandate transparency so that we can fully understand these systems and their potential for harm."""

# First, install Hugging Face's transformers library
# (sentencepiece is required by the Pegasus tokenizer)
%pip install transformers sentencepiece

from transformers import PegasusForConditionalGeneration, PegasusTokenizer
import torch

# Set the seed so results are reproducible. Changing the seed
# produces different results whenever sampling is enabled.
from transformers import set_seed
set_seed(248602)

# We're using the version of Pegasus specifically trained for summarization
# using the CNN/DailyMail dataset
model_name = "google/pegasus-cnn_dailymail"

# If you're following along in Colab, switch your runtime to a
# T4 GPU or another CUDA-capable device for a speedup
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer
tokenizer = PegasusTokenizer.from_pretrained(model_name)

# Load the model
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(device)

# Tokenize the entire content
batch = tokenizer(content, padding="longest", return_tensors="pt").to(device)

# Generate the summary as tokens
summarized = model.generate(**batch)

# Decode the tokens back into text
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]

# Compare the original and the summary by character count,
# and print the summary wrapped at 100 columns
def compare(original, summarized_text):
    print(f"Article text length: {len(original)}\n")
    print(textwrap.fill(summarized_text, 100))
    print()
    print(f"Summarized length: {len(summarized_text)}")

compare(content, summarized_text)

Article text length: 1427

Trustworthy AI should enable people to decide how their data is used.<n>values and goals of a system
should be power aware and seek to minimize harm.<n>People should have agency and control over their
data and algorithmic outputs.<n>Developers need to implement strong measures to protect our data and
personal security.

Summarized length: 320
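
The `<n>` markers in the output above appear to be literal separators that this Pegasus checkpoint emits between summary sentences; `skip_special_tokens=True` does not remove them. A minimal sketch for tidying them up follows. The helper name `clean_pegasus_summary` is hypothetical, and treating `<n>` as a sentence break is an assumption based only on the printed output.

```python
def clean_pegasus_summary(text, separator="\n"):
    """Replace literal "<n>" markers with a separator (assumed to mark sentence breaks)."""
    parts = [part.strip() for part in text.split("<n>")]
    # Drop empty fragments so leading or doubled markers leave no blank lines.
    return separator.join(part for part in parts if part)


example = (
    "Trustworthy AI should enable people to decide how their data is used."
    "<n>values and goals of a system should be power aware and seek to minimize harm."
)
print(clean_pegasus_summary(example))
```

Passing `separator=" "` keeps the summary on one line, which may be preferable before handing it to a scoring metric.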

set_seed(860912)

# Generate the summary as tokens, capping the output at 800 new tokens
summarized = model.generate(**batch, max_new_tokens=800)
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]

compare(content, summarized_text)

Article text length: 1427

Trustworthy AI should enable people to decide how their data is used.<n>values and goals of a system
should be power aware and seek to minimize harm.<n>People should have agency and control over their
data and algorithmic outputs.<n>Developers need to implement strong measures to protect our data and
personal security.

Summarized length: 320

set_seed(118511)
summarized = model.generate(**batch, do_sample=True, temperature=0.8, top_k=0)
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]
compare(content, summarized_text)

Article text length: 1427

Mozilla's "Trustworthy AI" Thinking Points:.<n>People should have agency and control over their data
and algorithmic outputs.<n>Developers need to implement strong measures to protect our data.

Summarized length: 193

set_seed(108814)
summarized = model.generate(**batch, do_sample=True, temperature=1.0, top_k=0)
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]
compare(content, summarized_text)

Article text length: 1427

Mozilla's "Trustworthy AI" Thinking Points:.<n>People should have agency and control over their data
and algorithmic outputs.<n>Developers need to implement strong measures to protect our data and
personal security.<n>We need to mandate transparency so that we can fully understand these systems
and their potential for harm.

Summarized length: 325

set_seed(226012)
summarized = model.generate(**batch, do_sample=True, top_k=50)
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]
compare(content, summarized_text)

Article text length: 1427

Mozilla's "Trustworthy AI" Thinking Points look at ethical issues surrounding automated decision
making.<n>values and goals of a system should be power aware and seek to minimize harm.<n>People
should have agency and control over their data and algorithmic outputs.<n>Developers need to
implement strong measures to protect our data and personal security.

Summarized length: 355

set_seed(21420041)
summarized = model.generate(**batch, do_sample=True, top_p=0.9, top_k=50)
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]
compare(content, summarized_text)

# saving this for later.
pegasus_summarized_text = summarized_text

Article text length: 1427

Mozilla's "Trustworthy AI" Thinking Points:.<n>People should have agency and control over their data
and algorithmic outputs.<n>Developers need to implement strong measures to protect our data and
personal security.<n>We need to mandate transparency so that we can fully understand these systems
and their potential for harm.

Summarized length: 325
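
The runs above vary `temperature`, `top_k`, and `top_p`. As a rough illustration of what those knobs do to a single next-token choice, here is a pure-Python sketch. It is not the transformers implementation; `sample_next_token` and `toy_logits` are made-up names, and the order of operations (temperature, then top-k, then top-p over the renormalized survivors) is an assumption about the usual arrangement.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Pick one token index from raw scores using temperature, top-k and top-p."""
    # Temperature rescales the scores: values below 1.0 sharpen the distribution.
    scaled = [score / temperature for score in logits]

    # Softmax turns scores into probabilities (subtract the max for stability).
    peak = max(scaled)
    weights = [math.exp(score - peak) for score in scaled]
    total = sum(weights)
    probs = [weight / total for weight in weights]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top_k keeps only the k most likely tokens; 0 disables the filter.
    if top_k > 0:
        ranked = ranked[:top_k]
    kept_total = sum(probs[i] for i in ranked)

    # top_p keeps the smallest prefix whose renormalized mass reaches top_p.
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index] / kept_total
        if cumulative >= top_p:
            break

    # Sample among the survivors in proportion to their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
rng = random.Random(0)
print([sample_next_token(toy_logits, temperature=0.8, top_k=2, rng=rng) for _ in range(10)])
```

With `top_k=2` only the two highest-scoring indices can ever be drawn, which is why tighter settings tend to produce summaries that stay closer to each other across seeds.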

from transformers import BartTokenizer, BartForConditionalGeneration

set_seed(120986)
bart_model_name = "facebook/bart-large-cnn"

# Load the tokenizer
bart_tokenizer = BartTokenizer.from_pretrained(bart_model_name)

# Load the model
bart_model = BartForConditionalGeneration.from_pretrained(bart_model_name).to(device)

# Now try BART with sampling settings similar to the last Pegasus run
# (note the lower top_p of 0.5 and the explicit max_new_tokens cap)

batch = bart_tokenizer(content, padding="longest", return_tensors="pt").to(device)
summarized = bart_model.generate(**batch, do_sample=True, top_p=0.5, top_k=50, max_new_tokens=500)
summarized_decoded = bart_tokenizer.batch_decode(summarized, skip_special_tokens=True)
summarized_text = summarized_decoded[0]
compare(content, summarized_text)

bart_summarized_text = summarized_text

Article text length: 1427

Mozilla's "Trustworthy AI" Thinking Points: How is data collected, stored, and shared? Our personal
data powers everything from traffic maps to targeted advertising. Trustworthy AI should enable
people to decide how their data is used and what decisions are made with it.

Summarized length: 271
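
Every run above repeats the same three steps: tokenize, generate, decode. One way to cut the repetition is a small wrapper. `summarize` is a hypothetical helper, not part of transformers; it only chains the calls already used above and forwards any extra keyword arguments to `generate`.

```python
def summarize(model, tokenizer, text, device="cpu", **generate_kwargs):
    """Tokenize text, generate a summary, and decode it back into a string."""
    batch = tokenizer(text, padding="longest", return_tensors="pt").to(device)
    tokens = model.generate(**batch, **generate_kwargs)
    return tokenizer.batch_decode(tokens, skip_special_tokens=True)[0]


# Usage with the objects loaded earlier (left commented out so this block
# can be read without downloading the models):
# bart_summarized_text = summarize(
#     bart_model, bart_tokenizer, content, device,
#     do_sample=True, top_p=0.5, top_k=50, max_new_tokens=500,
# )
```

Keeping the generation settings as keyword arguments makes it easier to see at a glance which parameters differ between two runs.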


from transformers import set_seed
set_seed(248602)

# Loading up Pegasus and BART again
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
from transformers import BartTokenizer, BartForConditionalGeneration
device = "cuda" if torch.cuda.is_available() else "cpu"

# summarizing using BART
set_seed(120986)
bart_model_name = "facebook/bart-large-cnn"

# Load the tokenizer
bart_tokenizer = BartTokenizer.from_pretrained(bart_model_name)

# Load the model
bart_model = BartForConditionalGeneration.from_pretrained(bart_model_name).to(device)

# Generate the BART summary
batch = bart_tokenizer(content, padding="longest", return_tensors="pt").to(device)
summarized = bart_model.generate(**batch, do_sample=True, top_p=0.5, top_k=50, max_new_tokens=500)
summarized_decoded = bart_tokenizer.batch_decode(summarized, skip_special_tokens=True)
bart_summarized_text = summarized_decoded[0]
print(bart_summarized_text)

Mozilla's "Trustworthy AI" Thinking Points: How is data collected, stored, and shared? Trustworthy AI should enable people to decide how their data is used. AI systems that depend on human workers should protect people from exploitation and overwork. The values and goals of a system should be power aware and seek to minimize harm.

# summarizing using Pegasus

# We're using the version of Pegasus specifically trained for summarization
# using the CNN/DailyMail dataset
model_name = "google/pegasus-cnn_dailymail"

# Load the tokenizer
tokenizer = PegasusTokenizer.from_pretrained(model_name)

# Load the model
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(device)

# Tokenize the entire content
batch = tokenizer(content, padding="longest", return_tensors="pt").to(device)

# Generate the summary as tokens
summarized = model.generate(**batch)

# Decode the tokens back into text
summarized_decoded = tokenizer.batch_decode(summarized, skip_special_tokens=True)
pegasus_summarized_text = summarized_decoded[0]
print(pegasus_summarized_text)

Trustworthy AI should enable people to decide how their data is used.<n>values and goals of a system should be power aware and seek to minimize harm.<n>People should have agency and control over their data and algorithmic outputs.<n>Developers need to implement strong measures to protect our data and personal security.

# Let's create a human-written reference summary:
reference = """
Mozilla's Trustworthy AI principles are Privacy controls over personal data,
minimizing bias and exploitation and maximizing Fairness,
ensuring data is sourced. and used appropriately leading to Trust,
Safety systems to protect from bad actors and environmental harm, and
Transparency to understand these systems in order to to reduce harm"""

%pip install rouge

from rouge import Rouge

rouge = Rouge()

# Now let's get the ROUGE scores
pegasus_scores = rouge.get_scores(pegasus_summarized_text, reference)[0]
bart_scores = rouge.get_scores(bart_summarized_text, reference)[0]

# ROUGE-1 Scores
print(f"Pegasus ROUGE-1 Scores: {pegasus_scores['rouge-1']}")
print(f"BART ROUGE-1 Scores:    {bart_scores['rouge-1']}")
print()

Pegasus ROUGE-1 Scores: {'r': 0.275, 'p': 0.275, 'f': 0.2749999950000001}
BART ROUGE-1 Scores:    {'r': 0.325, 'p': 0.29545454545454547, 'f': 0.3095238045351475}

# ROUGE-2 Scores
print(f"Pegasus ROUGE-2 Scores: {pegasus_scores['rouge-2']}")
print(f"BART ROUGE-2 Scores:    {bart_scores['rouge-2']}")
print()

Pegasus ROUGE-2 Scores: {'r': 0.0625, 'p': 0.061224489795918366, 'f': 0.06185566510362459}
BART ROUGE-2 Scores:    {'r': 0.0625, 'p': 0.05660377358490566, 'f': 0.05940593560631353}

# ROUGE-L Scores
print(f"Pegasus ROUGE-L Scores: {pegasus_scores['rouge-l']}")
print(f"BART ROUGE-L Scores:    {bart_scores['rouge-l']}")
print()

Pegasus ROUGE-L Scores: {'r': 0.2, 'p': 0.2, 'f': 0.19999999500000015}
BART ROUGE-L Scores:    {'r': 0.3, 'p': 0.2727272727272727, 'f': 0.2857142807256236}
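
The `r`, `p`, and `f` values above are recall, precision, and F-score. To make the idea concrete, here is a simplified sketch of ROUGE-N as n-gram overlap. It assumes whitespace tokenization and counts each distinct n-gram once; the `rouge` package may tokenize and count differently, so `rouge_n` (a made-up name) is not expected to reproduce the numbers above exactly.

```python
def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N over distinct, whitespace-split n-grams."""
    def ngrams(text):
        tokens = text.split()
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = len(cand & ref)
    # Precision: share of the candidate's n-grams that also appear in the reference.
    precision = overlap / len(cand) if cand else 0.0
    # Recall: share of the reference's n-grams that the candidate recovered.
    recall = overlap / len(ref) if ref else 0.0
    f_score = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"r": recall, "p": precision, "f": f_score}


print(rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1))
print(rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2))
```

Because the score only rewards exact word matches, a summary that paraphrases the reference well can still score poorly.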

reference = "Mozilla's 'Trustworthy AI' is built on five key principles. Privacy emphasizes user control over data collection and usage. Fairness focuses on minimizing bias in computational models, as well as protecting human workers from exploitation. Trust aims to provide individuals with control over their data and the decisions made by algorithms. Safety prioritizes protection against misuse of data, as well as reducing environmental impact. Lastly, Transparency mandates clarity in automated decision-making processes to prevent potential harm."
print(textwrap.fill(reference, 100))

Mozilla's 'Trustworthy AI' is built on five key principles. Privacy emphasizes user control over
data collection and usage. Fairness focuses on minimizing bias in computational models, as well as
protecting human workers from exploitation. Trust aims to provide individuals with control over
their data and the decisions made by algorithms. Safety prioritizes protection against misuse of
data, as well as reducing environmental impact. Lastly, Transparency mandates clarity in automated
decision-making processes to prevent potential harm.

# Now let's get the ROUGE scores
pegasus_scores = rouge.get_scores(pegasus_summarized_text, reference)[0]
bart_scores = rouge.get_scores(bart_summarized_text, reference)[0]

# ROUGE-1 Scores
print(f"Pegasus ROUGE-1 Scores: {pegasus_scores['rouge-1']}")
print(f"BART ROUGE-1 Scores:    {bart_scores['rouge-1']}")
print()

# ROUGE-2 Scores
print(f"Pegasus ROUGE-2 Scores: {pegasus_scores['rouge-2']}")
print(f"BART ROUGE-2 Scores:    {bart_scores['rouge-2']}")
print()

# ROUGE-L Scores
print(f"Pegasus ROUGE-L Scores: {pegasus_scores['rouge-l']}")
print(f"BART ROUGE-L Scores:    {bart_scores['rouge-l']}")
print()

Pegasus ROUGE-1 Scores: {'r': 0.140625, 'p': 0.225, 'f': 0.1730769183431954}
BART ROUGE-1 Scores:    {'r': 0.203125, 'p': 0.29545454545454547, 'f': 0.24074073591220863}

Pegasus ROUGE-2 Scores: {'r': 0.056338028169014086, 'p': 0.08163265306122448, 'f': 0.06666666183472257}
BART ROUGE-2 Scores:    {'r': 0.04225352112676056, 'p': 0.05660377358490566, 'f': 0.04838709187955305}

Pegasus ROUGE-L Scores: {'r': 0.140625, 'p': 0.225, 'f': 0.1730769183431954}
BART ROUGE-L Scores:    {'r': 0.203125, 'p': 0.29545454545454547, 'f': 0.24074073591220863}
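
ROUGE-L scores the longest common subsequence (LCS) of tokens instead of fixed-size n-grams, so it rewards words that appear in the same order even when they are not adjacent. A minimal sentence-level sketch, assuming whitespace tokenization; `lcs_length` and `rouge_l` are illustrative names, and the `rouge` package may use a different (summary-level) variant, so the numbers will not necessarily match the output above.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l(candidate, reference):
    """Simplified sentence-level ROUGE-L from the LCS of whitespace tokens."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    precision = lcs / len(cand) if cand else 0.0
    recall = lcs / len(ref) if ref else 0.0
    f_score = 2 * precision * recall / (precision + recall) if lcs else 0.0
    return {"r": recall, "p": precision, "f": f_score}


print(rouge_l("the cat sat on the mat", "the cat lay on the mat"))
```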

%pip install bert_score

from bert_score import BERTScorer

# Let's set up BERTScorer and score the Pegasus summary first
scorer = BERTScorer(lang="en", rescale_with_baseline=True)

p, r, f1 = scorer.score([pegasus_summarized_text], [reference])
print(f"Pegasus BERTScore: 'r': {r}, 'p': {p}, 'f': {f1}")

Pegasus BERTScore: 'r': tensor([0.2065]), 'p': tensor([0.1569]), 'f': tensor([0.1829])

p, r, f1 = scorer.score([bart_summarized_text], [reference])
print(f"BART BERTScore: 'r': {r}, 'p': {p}, 'f': {f1}")

BART BERTScore: 'r': tensor([0.2861]), 'p': tensor([0.3487]), 'f': tensor([0.3183])
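
Unlike ROUGE, BERTScore compares token embeddings, so near-synonyms can still count as matches. The core idea is greedy matching by cosine similarity, sketched here with made-up 2-D vectors standing in for real contextual embeddings. `greedy_match_score` is a hypothetical name, and the sketch leaves out the embedding model, optional IDF weighting, and the baseline rescaling that `rescale_with_baseline=True` applies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def greedy_match_score(candidate_vecs, reference_vecs):
    """Return (precision, recall, f1) from best-match cosine similarities."""
    # Precision: each candidate token takes its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: each reference token takes its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy vectors (illustrative only): one row per token.
toy_candidate = [[1.0, 0.0], [0.0, 1.0]]
toy_reference = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
print(greedy_match_score(toy_candidate, toy_reference))
```

In this toy case every candidate token has an exact counterpart in the reference, so precision is perfect, while recall drops because one reference token is only partially covered.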

# Plot the token-by-token similarity matrix for two new sentences
# that share a meaning but few exact words.
scorer.plot_example("Hot days forecasted during peak summer, 54 degrees.",
                    "Summer days mean high temperatures of fifty-four C")

AI Guide

Fall '23 Release
