@handysome6
…tence

This commit persists Xiaohongshu (小红书) login cookies between runs, so that users no longer have to scan a QR code on every run.

Key Features

  1. Automatic Cookie Management

    • Auto-save cookies after successful QR code or phone login
    • Auto-load and validate saved cookies on subsequent runs
    • Fall back to the configured login method if the saved cookies are invalid
  2. New CookieManager Class

    • Manages cookie persistence to cookies/xhs_cookies.json
    • Checks cookie age and warns when saved cookies are 30 or more days old
    • Provides cookie info and cleanup methods
  3. Enhanced Login Flow

    • Prioritizes saved cookies before prompting for login
    • Validates cookies with pong() check before use
    • Saves ALL cookies, not just web_session
    • Seamless authentication on subsequent runs
  4. New Configuration Option

    • AUTO_SAVE_AND_USE_COOKIES (default: True)
    • Enables/disables automatic cookie management
    • Backward compatible with existing configurations
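
The cookie persistence described above can be sketched roughly as follows. This is a minimal illustration and not the code in `media_platform/xhs/cookie_manager.py`: the method names (`save`, `load`, `age_days`, `is_stale`, `clear`) and the JSON layout (`saved_at`, `cookies`) are assumptions made for the example. Only the file path and the 30-day threshold come from the description.

```python
import json
import time
from pathlib import Path
from typing import Optional

MAX_COOKIE_AGE_DAYS = 30  # age threshold mentioned in this PR
SECONDS_PER_DAY = 86400


class CookieManager:
    """Illustrative sketch; method names and file layout are assumptions."""

    def __init__(self, path: str = "cookies/xhs_cookies.json") -> None:
        self.path = Path(path)

    def save(self, cookies: list, now: Optional[float] = None) -> None:
        # Store ALL cookies (not only web_session) together with a timestamp.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        payload = {
            "saved_at": time.time() if now is None else now,
            "cookies": cookies,
        }
        self.path.write_text(
            json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
        )

    def load(self) -> Optional[dict]:
        # Return the stored payload, or None if the file is missing or unreadable.
        try:
            payload = json.loads(self.path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            return None
        if not isinstance(payload, dict) or "saved_at" not in payload:
            return None
        return payload

    def age_days(self, now: Optional[float] = None) -> Optional[float]:
        # Age of the saved cookies in days, or None if nothing is saved.
        payload = self.load()
        if payload is None:
            return None
        current = time.time() if now is None else now
        return (current - payload["saved_at"]) / SECONDS_PER_DAY

    def is_stale(self, now: Optional[float] = None) -> bool:
        # True when saved cookies exist and have reached the age threshold.
        age = self.age_days(now)
        return age is not None and age >= MAX_COOKIE_AGE_DAYS

    def clear(self) -> None:
        # Remove the saved cookie file if it exists.
        self.path.unlink(missing_ok=True)
```

In this sketch a stale cookie file only triggers a warning-style check (`is_stale`); whether the cookies still work is left to the `pong()` validation in the login flow.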

Changes

New Files

  • media_platform/xhs/cookie_manager.py: Cookie persistence utility
  • docs/xiaohongshu_auth_improvement.md: English documentation
  • docs/xiaohongshu_auth_improvement_zh.md: Chinese documentation

Modified Files

  • media_platform/xhs/login.py: Enhanced cookie login, auto-save after login
  • media_platform/xhs/core.py: Improved authentication flow with cookie priority
  • config/base_config.py: Added AUTO_SAVE_AND_USE_COOKIES option
  • .gitignore: Added /cookies/ directory to prevent cookie commits

Benefits

Before: Scan QR code every single run ❌
After: Scan QR code only once, then automatic authentication ✅
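
The "scan once, then reuse" behaviour can be sketched as a small cookie-first login routine. This is an assumption-laden illustration, not the code in `media_platform/xhs/core.py`: the function `authenticate` and its callable parameters are hypothetical, `is_valid` stands in for the `pong()` check, and `interactive_login` stands in for the configured QR code or phone login. Async callables are assumed here.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Option added to config/base_config.py by this PR (default: True).
AUTO_SAVE_AND_USE_COOKIES = True


async def authenticate(
    load_saved: Callable[[], Optional[list]],
    is_valid: Callable[[list], Awaitable[bool]],
    interactive_login: Callable[[], Awaitable[list]],
    save: Callable[[list], None],
    use_saved: bool = AUTO_SAVE_AND_USE_COOKIES,
) -> tuple:
    """Return (cookies, source), where source is "saved" or "login"."""
    if use_saved:
        saved = load_saved()
        # Saved cookies take priority, but only if the validity check passes.
        if saved and await is_valid(saved):
            return saved, "saved"
    # Fall back to the configured login method (QR code or phone).
    cookies = await interactive_login()
    if use_saved:
        # Persist the fresh cookies so the next run can skip the login prompt.
        save(cookies)
    return cookies, "login"
```

With `use_saved=False` the routine always goes through the interactive login and never writes cookies, which corresponds to disabling `AUTO_SAVE_AND_USE_COOKIES`.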

Security

  • Cookie directory automatically gitignored
  • Cookies stored locally with timestamps
  • No sensitive data committed to repository

Documentation

Comprehensive documentation provided in both English and Chinese, including:

  • Configuration guide
  • Usage examples
  • Troubleshooting tips
  • API reference
  • Migration guide

This removes the need to re-scan the QR code on every run when crawling Xiaohongshu URLs.
