(Trigger Warning: AI-generated content below.)
TL;DR: A NotebookLM-generated Podcast about the Copyright Issues of NotebookLM-generated Podcasts
(I simply pasted everything below into NotebookLM, along with a custom prompt telling the bots to skip their typical overuse of highly annoying clichés.)
Would a podcast about a book be a copyright violation?
If I were to buy Sal Khan’s book “Brave New Words: How AI Will Revolutionize Education” and my partner and I were to create a podcast in which we discuss one chapter per episode, would that be a copyright violation?
Here’s an example of how Google Gemini’s new thinking model thinks before it gives you an answer:
Thoughts
The thoughts produced by the model are experimental
Here’s a thinking process to arrive at the comprehensive answer regarding the podcast and copyright:
- Identify the core question: The user wants to know if discussing a book, chapter by chapter, in a podcast constitutes copyright infringement.
- Recognize the key element: Copyright protects the expression of ideas, not the ideas themselves. This is the foundational principle.
- Consider the actions involved in the podcast: Discussing, summarizing, analyzing, quoting—these are the primary activities.
- Analyze each action in relation to copyright:
  - Discussion/Analysis/Interpretation: These are generally safe as they deal with ideas and personal perspectives.
  - Summarizing: Potentially problematic if too extensive and replaces the original. Need to consider the “fair use” concept.
  - Quoting: Definitely subject to copyright. The key is the amount and purpose of the quote. Short, attributed quotes for criticism or analysis are more likely to be fair use. Long, unattributed quotes are problematic.
  - Reading excerpts verbatim: Likely infringement unless it falls under fair use (very short, specific purpose).
  - Reproducing significant portions of the text in show notes/transcripts: Potentially infringing.
- Introduce the concept of “fair use”: This is the primary defense against copyright infringement in situations like this. Explain the four factors: purpose/character, nature of the work, amount/substantiality, and effect on the market. Apply these factors specifically to the podcast scenario.
- Consider other relevant legal doctrines: Mention the idea-expression dichotomy again for reinforcement.
- Think about practical advice and best practices: This is crucial for the user.
  - Attribution: Always cite the author and book.
  - Limited quoting: Keep quotes concise and purposeful.
  - Focus on analysis and commentary: Emphasize original thought.
  - Transformative use: Explain how their podcast adds value beyond simply re-presenting the book.
  - Disclaimer: Suggest a disclaimer about fair use.
  - Seeking permission: Mention this as the safest route, though less likely to be pursued by the user.
- Address the “one chapter per episode” aspect: This structure in itself isn’t a copyright issue, but it could influence the amount of material discussed in each episode, making the fair use analysis for each episode important.
- Consider the commercial aspect: If the podcast is monetized, it strengthens the argument against fair use. This needs to be included.
- Provide a strong concluding recommendation: Advise seeking legal counsel for definitive answers, as copyright law is complex and fact-specific. Emphasize the importance of caution.
- Structure the answer logically: Start with a clear summary, then delve into the details, provide practical advice, and end with a cautionary note. Use headings and bullet points for clarity.
- Review and refine: Ensure the language is clear, concise, and easy to understand for someone who may not have legal expertise. Check for any ambiguities or inconsistencies. For example, initially, I might have just said “quoting is okay,” but then I refined it to emphasize short and attributed quotes.
And then it spit out its answer:
Discussing Sal Khan’s book “Brave New Words: How AI Will Revolutionize Education” chapter by chapter in a podcast likely falls under fair use and would generally not be considered copyright infringement, as long as you adhere to certain principles.
Here’s a breakdown of why and what to keep in mind:
Why it’s likely fair use:
- Purpose and Character of the Use: Your podcast is likely transformative. You’re not simply rereading the book. You’re adding commentary, analysis, discussion, and your own perspectives. This falls under criticism, commentary, scholarship, and research, all of which are purposes that favor fair use.
- Nature of the Copyrighted Work: The book is a published work, and discussing its ideas is generally encouraged. While the expression is protected, the underlying ideas are not.
- Amount and Substantiality of the Portion Used: You’re discussing a chapter at a time. You’re not reproducing large portions of the book verbatim. Your focus is on your thoughts and interpretations, not the entire text itself.
- Effect of the Use Upon the Potential Market for or Value of the Copyrighted Work: Your podcast could actually increase interest in the book, potentially leading to more sales. It’s unlikely to be a substitute for reading the book itself.
However, you need to be careful and adhere to these guidelines to stay within fair use:
- Focus on Discussion and Analysis: The core of your podcast should be your own thoughts, interpretations, and discussions about the chapter’s content.
- Limit Direct Quotes: While short, attributed quotes are generally acceptable for illustrating a point, avoid excessive verbatim quoting. Don’t reproduce entire paragraphs or substantial sections of the book.
- Don’t Replace the Experience of Reading the Book: Your podcast should encourage people to read the book, not provide a substitute for it.
- Attribution is Key: Clearly and consistently attribute the book to Sal Khan. Mention the title and publisher in each episode.
- Avoid Reproducing Significant Portions in Show Notes or Transcripts: Be cautious about including lengthy summaries or excerpts in your show notes or transcripts. Keep them focused on your discussion points.
- Consider Your Intent: If your primary goal is to help people understand the book and engage with its ideas, rather than to profit directly from reproducing its content, fair use is more likely to apply.
- Commercial vs. Non-Commercial Use: While fair use can apply to commercial ventures, it’s a stronger argument if your podcast is non-commercial (e.g., no advertising or paid subscriptions). If you intend to monetize the podcast, be extra cautious about the amount of quoting and reliance on the book’s exact wording.
Things that could get you into trouble:
- Reading entire chapters verbatim: This would clearly infringe on copyright.
- Reproducing large sections of the book in your show notes or transcripts.
- Claiming the book’s ideas as your own.
- Creating a podcast that essentially is the audiobook, just read by you.
- Monetizing the podcast heavily while relying extensively on the book’s copyrighted content.
In summary:
Creating a podcast discussing Sal Khan’s book chapter by chapter, with a focus on your analysis and commentary, is likely protected under fair use. However, be mindful of the amount of direct quoting, ensure proper attribution, and avoid creating a substitute for reading the book itself.
Disclaimer: I am an AI and cannot provide legal advice. If you have specific concerns, it’s always best to consult with an intellectual property lawyer. However, based on general principles of copyright law and fair use, your proposed podcast is likely permissible as long as you follow the guidelines mentioned above.
Next Question: What about letting NotebookLM do it?
OK, next question: What if I upload the entire book to Google NotebookLM and use this custom prompt to create podcast episodes voiced by two AI-generated hosts that discuss the book:
“The source is “Brave new words : how AI will revolutionize education” by Salman Khan. Discuss only what’s in Part 1, which covers: Part I | Rise of the AI Tutor, Throwing Away the Bottle, How to Teach Everything to Everyone,Rise of the AI Tutor”
... and I do that for parts 1 through 11 of the book, and I use the generated audio files to create a podcast that’s not monetized? Would that be a copyright violation?
Thoughts
The thoughts produced by the model are experimental
Here’s a breakdown of the thinking process to arrive at the answer regarding AI-generated podcast episodes based on uploading the book to NotebookLM:
- Identify the Core Issue: The central question is whether using AI to generate discussions based on the book constitutes copyright infringement.
- Recall Copyright Principles: Copyright protects the expression of an idea, not the idea itself. This is a fundamental concept.
- Analyze the Action: The user is uploading the entire book into NotebookLM. This is a direct reproduction of the copyrighted work.
- Consider the Output: The AI is generating summaries, discussions, and potentially even paraphrasing or slightly rewriting the text. This derivative work is based directly on the copyrighted material.
- Apply the Derivative Work Concept: Creating a derivative work based on a copyrighted work requires permission from the copyright holder.
- Assess Fair Use (Again): While the output might resemble discussion, the input and process are significantly different from simply reading and discussing.
  - Purpose and Character: While non-commercial, the purpose leans heavily towards creating a derivative work of the book, not truly about it in the same transformative way as human discussion. The AI is essentially summarizing and reorganizing.
  - Nature of the Copyrighted Work: Still a published work, but the method of interaction is the key difference here.
  - Amount and Substantiality: Using the entire book as input weighs heavily against fair use. It’s not using small excerpts for commentary; it’s the whole thing.
  - Effect on the Market: Even non-commercial, this could potentially serve as a substitute for reading the book, particularly if the AI does a good job summarizing.
- Focus on the Method of Creation: The use of AI as an intermediary doesn’t magically erase the copyright implications of using the entire book as its source material. The AI is a tool facilitating the reproduction and transformation of the copyrighted work.
- Consider the “Who Owns the Output?” Question (Secondary, but relevant): While not the primary focus of the copyright question related to the book, it’s worth noting that the AI-generated content raises questions of ownership itself, although this is less of a concern for the user in terms of their infringement of the book’s copyright.
- Distinguish from the Previous Scenario: Highlight the crucial difference between human discussion (potentially fair use) and AI-generated content based on uploading the entire work (likely infringement).
- Formulate a Clear Conclusion: State definitively that this scenario is more likely to be a copyright violation than the previous one.
- Reinforce Key Principles: Emphasize the importance of permission and the distinction between discussing ideas and reproducing expression.
- Maintain the Disclaimer: Reiterate that this is not legal advice and professional counsel is recommended.
Its Answer
Uploading the entire book to Google NotebookLM and using AI to generate discussions based on specific parts of the book, even for a non-monetized podcast, is significantly more likely to be considered copyright infringement than the scenario where you and your partner discuss the book yourselves.
Here’s why:
- Reproduction of the Entire Work: You are creating a digital copy of the entire copyrighted work within NotebookLM. This is a direct act of reproduction, one of the core rights held by the copyright holder.
- Creating Derivative Works: The AI is essentially creating summaries, interpretations, and dialogues based directly on the book’s content. These AI-generated podcast episodes are derivative works of the original book. Copyright law grants the copyright holder the exclusive right to create derivative works.
- Lack of Human Interpretation and Transformation: While the AI generates a discussion format, it’s fundamentally based on the text you provided. The transformative element, which is crucial for fair use, is significantly diminished compared to human discussion where personal opinions, experiences, and analyses are primary. The AI is acting as an automated summarizer and rephraser.
- Scale and Scope: Generating discussions for the entire book, part by part, amplifies the infringement. It’s not just discussing a few concepts; it’s creating a comprehensive, albeit AI-generated, alternative way to engage with the book’s content.
- The Nature of AI-Generated Content: While the legal landscape around AI-generated content is still evolving, the act of feeding an entire copyrighted work into an AI to generate new content based on it is generally viewed as problematic from a copyright perspective.
Distinguishing from the Previous Scenario:
The key difference is the level of human input and transformation. In the first scenario, you and your partner are engaging with the book, offering your own unique perspectives and analyses. The book serves as a springboard for your own creative expression.
In the second scenario, the AI is the primary interpreter and communicator of the book’s content. You are essentially using AI to automate the creation of content derived directly from the copyrighted work.
Why non-monetization doesn’t automatically make it okay:
While non-commercial use is a factor considered in fair use analysis, it’s not a guaranteed shield against copyright infringement. The other factors, particularly the nature of the use and the amount and substantiality of the portion used, weigh heavily against you in this scenario. Even without making money, you are still potentially harming the market for the book by providing an alternative way to experience its content.
Potential Risks:
- Copyright Infringement Lawsuit: Sal Khan or his publisher could potentially sue you for copyright infringement.
- Takedown Notices: Platforms hosting your podcast (like Apple Podcasts, Spotify, etc.) could receive takedown notices and remove your content.
In summary, while the first scenario of human discussion was likely within fair use, using AI to generate podcast episodes based on uploading the entire book to NotebookLM is highly likely to be considered copyright infringement. You are essentially creating an AI-generated audio version of the book, broken down by parts, which infringes on the copyright holder’s rights to reproduction and the creation of derivative works.
Recommendation:
To avoid copyright issues, you should stick to the original plan of you and your partner discussing the book. If you want to incorporate AI, you could use it as a tool for research or outlining, but the core discussion should be driven by your own interpretations and thoughts.
Disclaimer: I am an AI and cannot provide legal advice. This information is for educational purposes only. If you have serious concerns about copyright, consult with an intellectual property lawyer.