Projects
Games
libretro-picodrive
Sign Up
Log In
Username
Password
We truncated the diff of some files because they were too big. If you want to see the full diff for every file,
click here
.
Overview
Repositories
Revisions
Requests
Users
Attributes
Meta
Expand all
Collapse all
Changes of Revision 2
View file
libretro-picodrive.changes
Changed
@@ -1,4 +1,204 @@ ------------------------------------------------------------------- +Sun Aug 09 10:56:09 UTC 2020 - i@guoyunhe.me + +- Update to version 0~git20200716: + * libretro, build fixes ffor android/ios + * libretro, tentative fix for android build + * sh2 drc, optimize standard division insns (default off, needs more scrutiny) + * Buildfix + * Buildfix + * Fix more conflicting types for prototypes + * Prevent collission with PS2 SDK + * Make sure function prototype signatures match, and put typedefs into separate header file + * core, keep offsets header from being build if no preprocessed asm files + * libretro, build fixes + * libretro, build fixes + * core, fix type issues by using stdint types + * libretro, build fixes + * sh2, fix for interpreter crash if drc is compiled in too + * sh2 drc, fix for x86_64 backend + * libretro, more fixes and cleanups for windows and osx + * libretro, fix for windows and osx + * libretro, changes to allow for both standalone and libretro build + * SDL UI, 2x overlay mode, for improved color resolution + * libretro make fix for non-arm architectures + * sh2 drc, fix for SH2 T handling in Mips/RiscV + * SDL UI, preparation for 2x mode, for improved color resolution + * sh2 drc, optimisation for SH2 16x16 multiplication + * SDL UI, fix for CD LED display + * sh2 drc, backend 32/64 bit compatibility fixes for Mips/RiscV + * vdp fifo, DMA bugfix + * sh2 drc, add powerpc64le backend + * sh2 drc, preparations for powerpc support + * vdp rendering, bugfix for overlapping high prio sprites + * release 1.96 + * add copyright stuff to substantially changed files + * sh2 drc: revised ARM A32 backend optimizer + * 32x: libretro bugfix + * sh2: optimisations in drc + * audio: fix for save/load + * sh2: bugfix in drc + * audio: SN76496 fixes + * 32x poll detection fix + * vdp fifo, bugfix + * audio: add option to switch off SSG-EG + * audio: fixes and optimizations for SSG-EG + * audio: improve cycle accuracy of SN76496+YM2612 + * 32x, small improvement for poll detection + * sh2, optimizations to innermost run loop + * add sh2 ubc area to poll detection + * 32x pwm, tiny optimization + * sh2 timer optimization + * menu background fix for pal mode + * ym2612 ARM optimisations + * vdp rendering, sprite caching optimization + * ym2612 ARM, bug fixing and small optimizations + * vdp DMA optimizations + * fix for gp2x audio regression + * vdp fifo speed optimization + * fix for 68K cycle accounting + * vdp rendering fixes + * vdp rendering, fix for CD (sprites from WORD RAM) + * ARM asm, symbol visibility fix + * vdp rendering fixes (debug register, vscroll) for overdrive 2 + * vdp fifo speed optimization + * hvcounter table resolution reduced + * vdp rendering improvements + * vdp rendering, tiny improvement + * 32x, small improvement for poll detector + * vdp, some small improvements + * fix config file parsing for long filenames + * arm asm sprite rendering: add line accidently deleted in ea431e9 + * ARM SVP drc revived + * vdp sprite rendering fixes + * more ARM asm sprite rendering bugfixes + * improved hi prio sprite rendering speed + * vdp, tentative fix for save/load compatibility + * fix for VINT while DMA is running + * fix for EI insn in cz80 (partial revert of 43e1401) + * bugfix for ARM asm sprite rendering + * vdp fifo, refined timing + * vdp sprite rendering fix + * vdp fifo, another revision + * vdp sprite handling improvement (SAT cache) + * vdp fifo, tentative fix for broken save/load + * vdp rendering fixes + * 32X poll detection fix + * fix compatibility with ancient gas + * vdp fifo: kludge for DMA fill interrupted by CPU + * sh2 drc: fix for crash in generated code on x86_64 + * revised VDP fifo implementation + * new hvcounter tables as per spritesmind.net threads + * regression fix for gp2x 8bit fast mode + * improved VRAM128K support (overdrive 2) + * VDP timing improvements + * added debug reg sprite plane support (fixes some issues in overdrive 2 demo) + * sprite rendering improvements for masking and limit edge cases + * audio fixes for overdrive demo + * emulator timing fixes, VDP DMA fixes, improved DAC audio + * bug fixes in drc, audio, display + * audio: added SSG-EG to YM2612, plus some timing changes for SN76496+YM2612 + * add DC filter to sound mixer to remove potential PCM DC offset + * sh2 drc: updates from mame for ym2612 sound + * sh2 drc: optimize T bit handling for A64 + * sh2 drc: fix speed regression + * sh2 drc: cleanup, fix for drc crash, for mips code emitter + * remove textrels with -fPIC/-fPIE (for android/ios) + * sh2 drc, tentative MIPS32/64 Release 2 support + * release 1.95 + * sh2 drc: bug fixing + * sh2 drc: fixed some RISC-V bugs + * sh2 drc, small improvements and bug fixes for code emitters + * sh2 drc, improved memory management + * sh2 drc: RISC-V (RV64IM) code emitter, some work on MIPS64 + * sh2 drc: RISC-V (RV64IM) code emitter, some work on MIPS64 + * sh2 drc: optimizations for MIPS code emitting + * sh2 drc: moved host register assignment to code emitters, minor bugfixing + * 32x, finetuning + * fix gp2x regression + * sh2 drc: reorganised block mgmt code, plus some small scale optimisations + * sh2 drc: bugfix in block management + * sh2 drc: bugfix in block management + * sh2 drc bugfix for aarch64/mips + * 32x, improved auto frame skip, plus new config option for max auto skip + * 32x, configurable pwm irq optimization to reduce pwm irq load + * 32x, speed improvement + * sh2 drc: speed optimization and bugfixing + * sh2 drc: fix i386 regression + * sh2 drc: bug fixing and optimization in register cache and branch handling + * sh2 drc: drc exit, block linking and branch handling revised (overlooked commit) + * sh2 drc: drc exit, block linking and branch handling revised + * sh2 drc: improved RTS call stack cache + * sh2 drc: rework of register cache to implement basic loop optmization + * various smallish optimizations, cleanups, and bug fixes + * cleanup and microoptimizations in SH2 hw handling + * some drawing code C optimisations + * bug fix in comm poll fifo, and back to -O3 + * pff... README, 2nd try + * configuration changes and README + * cleanup config files, copyright stuff + * fix for mkoffsets without multiarch binutils + * various small fixes and optimsations + * sh2 drc: add aarch64 backend for A64 + * sh2 drc: add mipsel backend for MIPS32 Release 1 (for JZ47xx) + * SH2 drc: register cache overhaul (bugfixing, speed, readability) + * SH2 drc: bug fixing and small speed improvements + * 32X: memory access and polling bug fixes + * sh2 drc, x86 code emitter: use x86-64 registers R8-R15 + * 32x DMA memory copy performance optimisation + * sh2 drc, change utils abi to pass sh2 PC in arg0 (reduces compiled code size) + * sh2 drc, keep T bit in host flags as long as possible + * add xSR/RTS call stack cache to sh2 drc + * polling detection: communication poll fifo to avoid comm data loss + * sh2 memory access improvements, revive ARM asm memory functions + * sh2 drc, register cache optimisations + * sh2 drc, block management bugfixes and cleanup + * sh2 drc, add detection for in-memory polling + * sh2 drc, add loop detector, handle delay/idle loops + * sh2 drc, code emitter cleanup, add ARM reorder stage to reduce interlock + * sh2 drc, make B/W read functions signed (reduces generated code size) + * sh2 drc, improved constant handling and register allocator + * speed improvement and fixes for 32x ARM asm draw + * add literal pool to sh2 drc (for armv[456] without MOVT/W) + * sh2 drc, reuse blocks if already previously compiled (speedup for Virtua *) + * various small improvements and fixes + * overhaul of translation cache and sh2 literals handling + * added branch cache to sh2 drc to improve cross-tcache jump speed + * sh2 memory interface optimzations + * overhaul of the register cache (improves generated code by some 10+%) + * debug stuff, bug fixing + * move saving SH2 SR into memory access and do so only if needed + * add 32bit memory access functions for SH2 + * sh2 drc: sh2 addr modes generalization, more const propagation, code gen optimizations + * DRC: reworked scan_block (fix register usage masks, better block and literals detection) + * minor changes + * reworked palette and buffer handling due to some 32X bugs + * revamped 32X draw arm asm code + * kludges for wwf raw, nfl + * substituted tool to obtain target structure offsets (for asm) + * improved sh2 clock handling, bug fixing + small improvement to drc emitters + * sh2 drc host disassembler integration for gp2x + * bugfix for 32x + * bfd-less arm disassembler for gph + * config for x86 (32 bit only, for SH2 drc), add/revive profiling + * arm asm memory access functions for m/s68k + * config templates for gp2x, caanoo, dingux either with system toolchain (open2x,gph,opendingux) or ubuntu arm(gcc 4.7 is highest possible),mips + * arm asm syntax fixes for open2x + * make gp2x mp3 playback functional (need to unpack and compile helix decoder separately in platform/common/helix) + * fix gp2x compilation (using linaro arm gcc 4.7 on ubuntu) + * release 1.93 + * libretro: Allow setting GIT_VERSION. + * Makefile: Build with optimizations if DEBUG=0 + * Remove not longer files in Picodrive for PS2 + * Change GSKit PS2 version + * Update picodrive + * Define HAVE_NO_LANGEXTRA + * Adapt to newlib + * Add option to change sound quality + * Copy tile-based fast renderer buffer + * Add option to change renderer
View file
libretro-picodrive.spec
Changed
@@ -17,7 +17,7 @@ Name: libretro-picodrive -Version: 0~git20200112 +Version: 0~git20200716 Release: 0 Summary: PicoDrive libretro core for MegaDrive/Genesis emulation License: NonFree
View file
_servicedata
Changed
@@ -1,4 +1,4 @@ <servicedata> <service name="tar_scm"> <param name="url">https://github.com/libretro/picodrive.git</param> - <param name="changesrevision">495dddc0ad7847ab4a6ffeb54da3795a3f5a0e6d</param></service></servicedata> \ No newline at end of file + <param name="changesrevision">e6b80af200f19d8d518d427dc20314606e9d8510</param></service></servicedata> \ No newline at end of file
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/DrZ80/drz80.s
Deleted
@@ -1,8099 +0,0 @@ -;@ Reesy's Z80 Emulator Version 0.001 - -;@ (c) Copyright 2004 Reesy, All rights reserved -;@ DrZ80 is free for non-commercial use. - -;@ For commercial use, separate licencing terms must be obtained. - - .data - .align 4 - - .global DrZ80Run - .global DrZ80Ver - - .equiv INTERRUPT_MODE, 0 ;@0 = Use internal int handler, 1 = Use Mames int handler - .equiv FAST_Z80SP, 0 ;@0 = Use mem functions for stack pointer, 1 = Use direct mem pointer - .equiv UPDATE_CONTEXT, 0 - .equiv DRZ80_XMAP, 1 - .equiv DRZ80_XMAP_MORE_INLINE, 1 - -.if DRZ80_XMAP - .equ Z80_MEM_SHIFT, 13 -.endif - -.if INTERRUPT_MODE - .extern Interrupt -.endif - -DrZ80Ver: .long 0x0001 - -;@ --------------------------- Defines ---------------------------- -;@ Make sure that regs/pointers for z80pc to z80sp match up! - - z80_icount .req r3 - opcodes .req r4 - cpucontext .req r5 - z80pc .req r6 - z80a .req r7 - z80f .req r8 - z80bc .req r9 - z80de .req r10 - z80hl .req r11 - z80sp .req r12 - z80xx .req lr - - .equ z80pc_pointer, 0 ;@ 0 - .equ z80a_pointer, z80pc_pointer+4 ;@ 4 - .equ z80f_pointer, z80a_pointer+4 ;@ 8 - .equ z80bc_pointer, z80f_pointer+4 ;@ - .equ z80de_pointer, z80bc_pointer+4 - .equ z80hl_pointer, z80de_pointer+4 - .equ z80sp_pointer, z80hl_pointer+4 - .equ z80pc_base, z80sp_pointer+4 - .equ z80sp_base, z80pc_base+4 - .equ z80ix, z80sp_base+4 - .equ z80iy, z80ix+4 - .equ z80i, z80iy+4 - .equ z80a2, z80i+4 - .equ z80f2, z80a2+4 - .equ z80bc2, z80f2+4 - .equ z80de2, z80bc2+4 - .equ z80hl2, z80de2+4 - .equ cycles_pointer, z80hl2+4 - .equ previouspc, cycles_pointer+4 - .equ z80irq, previouspc+4 - .equ z80if, z80irq+1 - .equ z80im, z80if+1 - .equ z80r, z80im+1 - .equ z80irqvector, z80r+1 - .equ z80irqcallback, z80irqvector+4 - .equ z80_write8, z80irqcallback+4 - .equ z80_write16, z80_write8+4 - .equ z80_in, z80_write16+4 - .equ z80_out, z80_in+4 - .equ z80_read8, z80_out+4 - .equ z80_read16, z80_read8+4 - .equ z80_rebaseSP, z80_read16+4 - .equ z80_rebasePC, z80_rebaseSP+4 - - .equ VFlag, 0 - .equ CFlag, 1 - .equ ZFlag, 2 - .equ SFlag, 3 - .equ HFlag, 4 - .equ NFlag, 5 - .equ Flag3, 6 - .equ Flag5, 7 - - .equ Z80_CFlag, 0 - .equ Z80_NFlag, 1 - .equ Z80_VFlag, 2 - .equ Z80_Flag3, 3 - .equ Z80_HFlag, 4 - .equ Z80_Flag5, 5 - .equ Z80_ZFlag, 6 - .equ Z80_SFlag, 7 - - .equ Z80_IF1, 1<<0 - .equ Z80_IF2, 1<<1 - .equ Z80_HALT, 1<<2 - .equ Z80_NMI, 1<<3 - -;@--------------------------------------- - -.text - -.if DRZ80_XMAP - -z80_xmap_read8: @ addr - ldr r1,[cpucontext,#z80_read8] - mov r2,r0,lsr #Z80_MEM_SHIFT - ldr r1,[r1,r2,lsl #2] - movs r1,r1,lsl #1 - ldrccb r0,[r1,r0] - bxcc lr - -z80_xmap_read8_handler: @ addr, func - str z80_icount,[cpucontext,#cycles_pointer] - stmfd sp!,{r12,lr} - mov lr,pc - bx r1 - ldr z80_icount,[cpucontext,#cycles_pointer] - ldmfd sp!,{r12,pc} - -z80_xmap_write8: @ data, addr - ldr r2,[cpucontext,#z80_write8] - add r2,r2,r1,lsr #Z80_MEM_SHIFT-2 - bic r2,r2,#3 - ldr r2,[r2] - movs r2,r2,lsl #1 - strccb r0,[r2,r1] - bxcc lr - -z80_xmap_write8_handler: @ data, addr, func - str z80_icount,[cpucontext,#cycles_pointer] - mov r3,r0 - mov r0,r1 - mov r1,r3 - stmfd sp!,{r12,lr} - mov lr,pc - bx r2 - ldr z80_icount,[cpucontext,#cycles_pointer] - ldmfd sp!,{r12,pc} - -z80_xmap_read16: @ addr - @ check if we cross bank boundary - add r1,r0,#1 - eor r1,r1,r0 - tst r1,#1<<Z80_MEM_SHIFT - bne 0f - - ldr r1,[cpucontext,#z80_read8] - mov r2,r0,lsr #Z80_MEM_SHIFT - ldr r1,[r1,r2,lsl #2] - movs r1,r1,lsl #1 - bcs 0f - ldrb r0,[r1,r0]! - ldrb r1,[r1,#1] - orr r0,r0,r1,lsl #8 - bx lr - -0: - @ z80_xmap_read8 will save r3 and r12 for us - stmfd sp!,{r8,r9,lr} - mov r8,r0 - bl z80_xmap_read8 - mov r9,r0 - add r0,r8,#1 - bl z80_xmap_read8 - orr r0,r9,r0,lsl #8 - ldmfd sp!,{r8,r9,pc} - -z80_xmap_write16: @ data, addr - add r2,r1,#1 - eor r2,r2,r1 - tst r2,#1<<Z80_MEM_SHIFT - bne 0f - - ldr r2,[cpucontext,#z80_write8] - add r2,r2,r1,lsr #Z80_MEM_SHIFT-2 - bic r2,r2,#3 - ldr r2,[r2] - movs r2,r2,lsl #1 - bcs 0f - strb r0,[r2,r1]! - mov r0,r0,lsr #8 - strb r0,[r2,#1] - bx lr - -0: - stmfd sp!,{r8,r9,lr} - mov r8,r0 - mov r9,r1 - bl z80_xmap_write8 - mov r0,r8,lsr #8 - add r1,r9,#1 - bl z80_xmap_write8 - ldmfd sp!,{r8,r9,pc} - -z80_xmap_rebase_pc:
View file
libretro-picodrive-0~git20200112.tar.xz/pico/32x/draw_arm.s
Deleted
@@ -1,373 +0,0 @@ -@* -@* PicoDrive -@* (C) notaz, 2010 -@* -@* This work is licensed under the terms of MAME license. -@* See COPYING file in the top-level directory. -@* - -.extern Pico32x -.extern PicoDraw2FB -.extern HighPal - -.equiv P32XV_PRI, (1<< 7) - -.bss -.align 2 -.global Pico32xNativePal -Pico32xNativePal: - .word 0 - -.text -.align 2 - - -.macro call_scan_prep cond -.if \cond - ldr r4, =PicoScan32xBegin - ldr r5, =PicoScan32xEnd - ldr r6, =DrawLineDest - ldr r4, [r4] - ldr r5, [r5] - stmfd sp!, {r4,r5,r6} -.endif -.endm - -.macro call_scan_fin_ge cond -.if \cond - addge sp, sp, #4*3 -.endif -.endm - -.macro call_scan_begin cond -.if \cond - stmfd sp!, {r1-r3} - and r0, r2, #0xff - add r0, r0, r4 - mov lr, pc - ldr pc, [sp, #(3+0)*4] - ldr r0, [sp, #(3+2)*4] @ &DrawLineDest - ldmfd sp!, {r1-r3} - ldr r0, [r0] -.endif -.endm - -.macro call_scan_end cond -.if \cond - stmfd sp!, {r0-r3} - and r0, r2, #0xff - add r0, r0, r4 - mov lr, pc - ldr pc, [sp, #(4+1)*4] - ldmfd sp!, {r0-r3} -.endif -.endm - -@ direct color -@ unsigned short *dst, unsigned short *dram, int lines_sft_offs, int mdbg -.macro make_do_loop_dc name call_scan do_md -.global \name -\name: - stmfd sp!, {r4-r11,lr} - - ldr r10,=Pico32x - ldr r11,=PicoDraw2FB - ldr r10,[r10, #0x40] @ Pico32x.vdp_regs[0] - ldr r11,[r11] - ldr r9, =HighPal @ palmd - and r4, r2, #0xff - mov r5, #328 - lsl r3, #26 @ mdbg << 26 - mla r11,r4,r5,r11 @ r11 = pmd = PicoDraw2FB + offs*328: md data - tst r10,#P32XV_PRI - moveq r10,#0 - movne r10,#0x8000 @ r10 = inv_bit - call_scan_prep \call_scan - - mov r4, #0 @ line - b 1f @ loop_outer_entry - -0: @ loop_outer: - call_scan_end \call_scan - add r4, r4, #1 - sub r11,r11,#1 @ adjust for prev read - cmp r4, r2, lsr #16 - call_scan_fin_ge \call_scan - ldmgefd sp!, {r4-r11,pc} - -1: @ loop_outer_entry: - call_scan_begin \call_scan - mov r12,r4, lsl #1 - ldrh r12,[r1, r12] - add r11,r11,#8 - mov r6, #320 - add r5, r1, r12, lsl #1 @ p32x = dram + dram[l] - -2: @ loop_inner: - ldrb r7, [r11], #1 @ MD pixel - subs r6, r6, #1 - blt 0b @ loop_outer - ldrh r8, [r5], #2 @ 32x pixel - cmp r3, r7, lsl #26 @ MD has bg pixel? - beq 3f @ draw32x - eor r12,r8, r10 - ands r12,r12,#0x8000 @ !((t ^ inv) & 0x8000) -.if \do_md - mov r7, r7, lsl #1 - ldreqh r12,[r9, r7] - streqh r12,[r0], #2 @ *dst++ = palmd[*pmd] -.endif - beq 2b @ loop_inner - -3: @ draw32x: - and r12,r8, #0x03e0 - mov r8, r8, lsl #11 - orr r8, r8, r8, lsr #(10+11) - orr r8, r8, r12,lsl #1 - bic r8, r8, #0x0020 @ kill prio bit - strh r8, [r0], #2 @ *dst++ = bgr2rgb(*p32x++) - b 2b @ loop_inner -.endm - - -@ packed pixel -@ note: this may read a few bytes over the end of PicoDraw2FB and dram, -@ so those should have a bit more alloc'ed than really needed. -@ unsigned short *dst, unsigned short *dram, int lines_sft_offs, int mdbg -.macro make_do_loop_pp name call_scan do_md -.global \name -\name: - stmfd sp!, {r4-r11,lr} - - ldr r11,=PicoDraw2FB - ldr r10,=Pico32xNativePal - ldr r11,[r11] - ldr r10,[r10] - ldr r9, =HighPal @ palmd - and r4, r2, #0xff - mov r5, #328 - lsl r3, #26 @ mdbg << 26 - mla r11,r4,r5,r11 @ r11 = pmd = PicoDraw2FB + offs*328: md data - call_scan_prep \call_scan - - mov r4, #0 @ line - b 1f @ loop_outer_entry - -0: @ loop_outer: - call_scan_end \call_scan - add r4, r4, #1 - cmp r4, r2, lsr #16 - call_scan_fin_ge \call_scan - ldmgefd sp!, {r4-r11,pc} - -1: @ loop_outer_entry: - call_scan_begin \call_scan - mov r12,r4, lsl #1 - ldrh r12,[r1, r12] - add r11,r11,#8 - mov r6, #320/2 - add r5, r1, r12, lsl #1 @ p32x = dram + dram[l] - and r12,r2, #0x100 @ shift - add r5, r5, r12,lsr #8 - -2: @ loop_inner: -@ r4,r6 - counters; r5 - 32x data; r9,r10 - md,32x pal; r11 - md data -@ r7,r8,r12,lr - temp - tst r5, #1 - ldreqb r8, [r5], #2 - ldrb r7, [r5, #-1] - ldrneb r8, [r5, #2]! @ r7,r8 - pixel 0,1 index - subs r6, r6, #1 - blt 0b @ loop_outer - cmp r7, r8 - beq 5f @ check_fill @ +8 - -3: @ no_fill: - mov r12,r7, lsl #1 - mov lr, r8, lsl #1 - ldrh r7, [r10,r12] - ldrh r8, [r10,lr] - add r11,r11,#2 - - eor r12,r7, #0x20 - tst r12,#0x20 - ldrneb r12,[r11,#-2] @ MD pixel 0 - eor lr, r8, #0x20 - cmpne r3, r12, lsl #26 @ MD has bg pixel? -.if \do_md - mov r12,r12,lsl #1 - ldrneh r7, [r9, r12] @ t = palmd[pmd[0]]
View file
libretro-picodrive-0~git20200112.tar.xz/pico/pico_int_o32.h
Deleted
@@ -1,28 +0,0 @@ -/* autogenerated by tools/mkoffsets, do not edit */ -#define OFS_Pico_video_reg 0x0000 -#define OFS_Pico_m_rotate 0x0040 -#define OFS_Pico_m_z80Run 0x0041 -#define OFS_Pico_m_dirtyPal 0x0046 -#define OFS_Pico_m_hardware 0x0047 -#define OFS_Pico_m_z80_reset 0x004f -#define OFS_Pico_m_sram_reg 0x0049 -#define OFS_Pico_sv 0x008c -#define OFS_Pico_sv_data 0x008c -#define OFS_Pico_sv_start 0x0090 -#define OFS_Pico_sv_end 0x0094 -#define OFS_Pico_sv_flags 0x0098 -#define OFS_Pico_rom 0x033c -#define OFS_Pico_romsize 0x0340 -#define OFS_EST_DrawScanline 0x00 -#define OFS_EST_rendstatus 0x04 -#define OFS_EST_DrawLineDest 0x08 -#define OFS_EST_HighCol 0x0c -#define OFS_EST_HighPreSpr 0x10 -#define OFS_EST_Pico 0x14 -#define OFS_EST_PicoMem_vram 0x18 -#define OFS_EST_PicoMem_cram 0x1c -#define OFS_EST_PicoOpt 0x20 -#define OFS_EST_Draw2FB 0x24 -#define OFS_EST_HighPal 0x28 -#define OFS_PMEM_vram 0x10000 -#define OFS_PMEM_vsram 0x22100
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/ym2612_arm.s
Deleted
@@ -1,907 +0,0 @@ -/* - * PicoDrive - * (C) notaz, 2006 - * - * This work is licensed under the terms of MAME license. - * See COPYING file in the top-level directory. - */ - -@ this is a rewrite of MAME's ym2612 code, in particular this is only the main sample-generatin loop. -@ it does not seem to give much performance increase (if any at all), so don't use it if it causes trouble. -@ - notaz, 2006 - -@ vim:filetype=armasm - -.equiv SLOT1, 0 -.equiv SLOT2, 2 -.equiv SLOT3, 1 -.equiv SLOT4, 3 -.equiv SLOT_STRUCT_SIZE, 0x30 - -.equiv TL_TAB_LEN, 0x1A00 - -.equiv EG_ATT, 4 -.equiv EG_DEC, 3 -.equiv EG_SUS, 2 -.equiv EG_REL, 1 -.equiv EG_OFF, 0 - -.equiv EG_SH, 16 @ 16.16 fixed point (envelope generator timing) -.equiv EG_TIMER_OVERFLOW, (3*(1<<EG_SH)) @ envelope generator timer overflows every 3 samples (on real chip) -.equiv LFO_SH, 25 /* 7.25 fixed point (LFO calculations) */ - -.equiv ENV_QUIET, (2*13*256/8) - -.text -.align 2 - -@ r5=slot, r1=eg_cnt, trashes: r0,r2,r3 -@ writes output to routp, but only if vol_out changes -.macro update_eg_phase_slot slot - ldrb r2, [r5,#0x17] @ state - add r3, r5, #0x1c - tst r2, r2 - beq 0f @ EG_OFF - - ldr r2, [r3, r2, lsl #2] @ pack - mov r3, #1 - mov r0, r2, lsr #24 @ shift - mov r3, r3, lsl r0 - sub r3, r3, #1 - - tst r1, r3 - bne 0f @ no volume change - - mov r3, r1, lsr r0 - and r3, r3, #7 - add r3, r3, r3, lsl #1 - mov r3, r2, lsr r3 - and r3, r3, #7 @ eg_inc_val shift, may be 0 - ldrb r2, [r5,#0x17] @ state - ldrh r0, [r5,#0x1a] @ volume, unsigned (0-1023) - - cmp r2, #4 @ EG_ATT - beq 4f - cmp r2, #2 - mov r2, #1 - mov r2, r2, lsl r3 - mov r2, r2, lsr #1 @ eg_inc_val - add r0, r0, r2 - blt 1f @ EG_REL - beq 2f @ EG_SUS - -3: @ EG_DEC - ldr r2, [r5,#0x1c] @ sl (can be 16bit?) - mov r3, #EG_SUS - cmp r0, r2 @ if ( volume >= (INT32) SLOT->sl ) - strgeb r3, [r5,#0x17] @ state - b 10f - -4: @ EG_ATT - subs r3, r3, #1 @ eg_inc_val_shift - 1 - mov r2, #0 - mvnpl r2, r0 - mov r2, r2, lsl r3 - add r0, r0, r2, asr #4 - cmp r0, #0 @ if (volume <= MIN_ATT_INDEX) - movle r3, #EG_DEC - strleb r3, [r5,#0x17] @ state - movle r0, #0 - b 10f - -2: @ EG_SUS - mov r2, #1024 - sub r2, r2, #1 @ r2 = MAX_ATT_INDEX - cmp r0, r2 @ if ( volume >= MAX_ATT_INDEX ) - movge r0, r2 - b 10f - -1: @ EG_REL - mov r2, #1024 - sub r2, r2, #1 @ r2 = MAX_ATT_INDEX - cmp r0, r2 @ if ( volume >= MAX_ATT_INDEX ) - movge r0, r2 - movge r3, #EG_OFF - strgeb r3, [r5,#0x17] @ state - -10: @ finish - ldrh r3, [r5,#0x18] @ tl - strh r0, [r5,#0x1a] @ volume -.if \slot == SLOT1 - mov r6, r6, lsr #16 - add r0, r0, r3 - orr r6, r0, r6, lsl #16 -.elseif \slot == SLOT2 - mov r6, r6, lsl #16 - add r0, r0, r3 - mov r0, r0, lsl #16 - orr r6, r0, r6, lsr #16 -.elseif \slot == SLOT3 - mov r7, r7, lsr #16 - add r0, r0, r3 - orr r7, r0, r7, lsl #16 -.elseif \slot == SLOT4 - mov r7, r7, lsl #16 - add r0, r0, r3 - mov r0, r0, lsl #16 - orr r7, r0, r7, lsr #16 -.endif - -0: @ EG_OFF -.endm - - -@ r12=lfo_ampm[31:16], r1=lfo_cnt_old, r2=lfo_cnt, r3=scratch -.macro advance_lfo_m - mov r2, r2, lsr #LFO_SH - cmp r2, r1, lsr #LFO_SH - beq 0f - and r3, r2, #0x3f - cmp r2, #0x40 - rsbge r3, r3, #0x3f - bic r12,r12, #0xff000000 @ lfo_ampm &= 0xff - orr r12,r12, r3, lsl #1+24 - - mov r2, r2, lsr #2 - cmp r2, r1, lsr #LFO_SH+2 - bicne r12,r12, #0xff0000 - orrne r12,r12, r2, lsl #16 - -0: -.endm - - -@ result goes to r1, trashes r2 -.macro make_eg_out slot - tst r12, #8 - tstne r12, #(1<<(\slot+8)) -.if \slot == SLOT1 - mov r1, r6, lsl #16 - mov r1, r1, lsr #16 -.elseif \slot == SLOT2 - mov r1, r6, lsr #16 -.elseif \slot == SLOT3 - mov r1, r7, lsl #16 - mov r1, r1, lsr #16 -.elseif \slot == SLOT4 - mov r1, r7, lsr #16 -.endif - andne r2, r12, #0xc0 - movne r2, r2, lsr #6 - addne r2, r2, #24 - addne r1, r1, r12, lsr r2 - bic r1, r1, #1 -.endm - - -@ \r=sin/result, r1=env, r3=ym_tl_tab -.macro lookup_tl r - tst \r, #0x100 - eorne \r, \r, #0xff @ if (sin & 0x100) sin = 0xff - (sin&0xff); - tst \r, #0x200 - and \r, \r, #0xff - orr \r, \r, r1, lsl #7 - mov \r, \r, lsl #1 - ldrh \r, [r3, \r] @ 2ci if ne - rsbne \r, \r, #0 -.endm - - -@ lr=context, r12=pack (stereo, lastchan, disabled, lfo_enabled | pan_r, pan_l, ams[2] | AMmasks[4] | FB[4] | lfo_ampm[16]) -@ r0-r2=scratch, r3=sin_tab, r5=scratch, r6-r7=vol_out[4], r10=op1_out -.macro upd_algo0_m - - @ SLOT3 - make_eg_out SLOT3 - cmp r1, #ENV_QUIET - movcs r0, #0 - bcs 0f - ldr r2, [lr, #0x18]
View file
libretro-picodrive-0~git20200112.tar.xz/platform/ps2/SMS_Utils.s
Deleted
@@ -1,202 +0,0 @@ -/* -# ___ _ _ ___ -# | | | | | -# ___| | | ___| PS2DEV Open Source Project. -#---------------------------------------------------------- -# MUL64 is pulled from some binary library (I don't remember which one). -# mips_memcpy routine is pulled from 'sde' library from MIPS. -# -*/ -.set noat -.set noreorder -.set nomacro - -.globl MUL64 -.globl mips_memcpy -.globl mips_memset - -.text - -MUL64: - pmultuw $v0, $a0, $a1 - dsra32 $a2, $a0, 0 - dsra32 $v1, $a1, 0 - mult $v1, $a0, $v1 - mult1 $a2, $a2, $a1 - addu $v1, $v1, $a2 - dsll32 $v1, $v1, 0 - jr $ra - daddu $v0, $v0, $v1 - -mips_memcpy: - addu $v0, $a0, $zero - beqz $a2, 1f - sltiu $t2, $a2, 12 - bnez $t2, 2f - xor $v1, $a1, $a0 - andi $v1, $v1, 7 - negu $a3, $a0 - beqz $v1, 3f - andi $a3, $a3, 7 - beqz $a3, 4f - subu $a2, $a2, $a3 - ldr $v1, 0($a1) - ldl $v1, 7($a1) - addu $a1, $a1, $a3 - sdr $v1, 0($a0) - addu $a0, $a0, $a3 -4: - andi $v1, $a2, 31 - subu $a3, $a2, $v1 - beqz $a3, 5f - addu $a2, $v1, $zero - addu $a3, $a3, $a1 -6: - ldr $v1, 0($a1) - ldl $v1, 7($a1) - ldr $t0, 8($a1) - ldl $t0, 15($a1) - ldr $t1, 16($a1) - ldl $t1, 23($a1) - ldr $t2, 24($a1) - ldl $t2, 31($a1) - sd $v1, 0($a0) - sd $t0, 8($a0) - sd $t1, 16($a0) - addiu $a1, $a1, 32 - addiu $a0, $a0, 32 - bne $a1, $a3, 6b - sd $t2, -8($a0) -5: - andi $v1, $a2, 7 - subu $a3, $a2, $v1 - beqz $a3, 2f - addu $a2, $v1, $zero - addu $a3, $a3, $a1 -7: - ldr $v1, 0($a1) - ldl $v1, 7($a1) - addiu $a1, $a1, 8 - addiu $a0, $a0, 8 - nop - bne $a1, $a3, 7b - sd $v1, -8($a0) - beq $zero, $zero, 2f - nop -3: - beqz $a3, 8f - subu $a2, $a2, $a3 - ldr $v1, 0($a1) - addu $a1, $a1, $a3 - sdr $v1, 0($a0) - addu $a0, $a0, $a3 -8: - andi $v1, $a2, 31 - subu $a3, $a2, $v1 - beqz $a3, 9f - addu $a2, $v1, $zero - addu $a3, $a3, $a1 -10: - ld $v1, 0($a1) - ld $t0, 8($a1) - ld $t1, 16($a1) - ld $t2, 24($a1) - sd $v1, 0($a0) - sd $t0, 8($a0) - sd $t1, 16($a0) - addiu $a1, $a1, 32 - addiu $a0, $a0, 32 - bne $a1, $a3, 10b - sd $t2, -8($a0) -9: - andi $v1, $a2, 7 - subu $a3, $a2, $v1 - beqz $a3, 2f - addu $a2, $v1, $zero - addu $a3, $a3, $a1 -11: - ld $v1, 0($a1) - addiu $a1, $a1, 8 - addiu $a0, $a0, 8 - nop - nop - bne $a1, $a3, 11b - sd $v1, -8($a0) -2: - beqz $a2, 1f - addu $a3, $a2, $a1 -12: - lbu $v1, 0($a1) - addiu $a1, $a1, 1 - addiu $a0, $a0, 1 - nop - nop - bne $a1, $a3, 12b - sb $v1, -1($a0) -1: - jr $ra - nop - -mips_memset: - beqz $a2, 1f - sltiu $at, $a2, 16 - bnez $at, 2f - andi $a1, $a1, 0xFF - dsll $at, $a1, 0x8 - or $a1, $a1, $at - dsll $at, $a1, 0x10 - or $a1, $a1, $at - dsll32 $at, $a1, 0x0 - or $a1, $a1, $at - andi $v1, $a0, 0x7 - beqz $v1, 3f - li $a3, 8 - subu $a3, $a3, $v1 - subu $a2, $a2, $a3 - sdr $a1, 0($a0) - addu $a0, $a0, $a3 -3: - andi $v1, $a2, 0x1f - subu $a3, $a2, $v1 - beqz $a3, 4f - move $a2, $v1 - addu $a3, $a3, $a0 -5: - sd $a1, 0($a0) - sd $a1, 8($a0) - sd $a1, 16($a0) - addiu $a0, $a0, 32 - sd $a1, -8($a0) - bne $a0, $a3, 5b -4: - andi $v1, $a2, 0x7 - subu $a3, $a2, $v1 - beqz $a3, 2f - move $a2, $v1 - addu $a3, $a3, $a0 -6: - addiu $a0, $a0, 8 - beq $a0, $a3, 2f - sd $a1, -8($a0) - addiu $a0, $a0, 8 - beq $a0, $a3, 2f - sd $a1, -8($a0) - addiu $a0, $a0, 8 - bne $a0, $a3, 6b - sd $a1, -8($a0) -2: - beqz $a2, 1f - addu $a3, $a2, $a0 -7: - addiu $a0, $a0, 1 - beq $a0, $a3, 1f - sb $a1, -1($a0) - addiu $a0, $a0, 1 - beq $a0, $a3, 1f - sb $a1, -1($a0) - addiu $a0, $a0, 1 - bne $a0, $a3, 7b - sb $a1, -1($a0)
View file
libretro-picodrive-0~git20200112.tar.xz/platform/ps2/stdint.h
Deleted
@@ -1,16 +0,0 @@ -#ifndef STDINT_H -#define STDINT_H - -typedef unsigned long uintptr_t; -typedef signed long intptr_t; - -typedef signed char int8_t; -typedef signed short int16_t; -typedef signed int int32_t; -typedef signed long int64_t; -typedef unsigned char uint8_t; -typedef unsigned short uint16_t; -typedef unsigned int uint32_t; -typedef unsigned long uint64_t; - -#endif //STDINT_H
View file
libretro-picodrive-0~git20200112.tar.xz/Makefile -> libretro-picodrive-0~git20200716.tar.xz/Makefile
Changed
@@ -4,19 +4,6 @@ CYCLONE_CC ?= gcc CYCLONE_CXX ?= g++ -ifneq ("$(PLATFORM)", "libretro") - CFLAGS += -Wall -g - ifndef DEBUG - CFLAGS += -O3 -DNDEBUG - endif -endif - -# This is actually needed, bevieve me. -# If you really have to disable this, set NO_ALIGN_FUNCTIONS elsewhere. -ifndef NO_ALIGN_FUNCTIONS -CFLAGS += -falign-functions=2 -endif - all: config.mak target_ ifndef NO_CONFIG_MAK @@ -34,6 +21,39 @@ config.mak: endif +# This is actually needed, believe me. +# If you really have to disable this, set NO_ALIGN_FUNCTIONS elsewhere. +ifndef NO_ALIGN_FUNCTIONS +CFLAGS += -falign-functions=2 +endif + +# profiling +pprof ?= 0 +gperf ?= 0 + +ifneq ("$(PLATFORM)", "libretro") + CFLAGS += -Wall -g +ifneq ($(findstring gcc,$(CC)),) + CFLAGS += -ffunction-sections -fdata-sections + LDFLAGS += -Wl,--gc-sections +endif + ifndef DEBUG + CFLAGS += -O3 -DNDEBUG + endif + + LD = $(CC) + OBJOUT ?= -o + LINKOUT ?= -o +endif + +ifeq ("$(PLATFORM)",$(filter "$(PLATFORM)","gp2x" "opendingux" "rpi1")) +# very small caches, avoid optimization options making the binary much bigger +CFLAGS += -finline-limit=42 -fno-unroll-loops -fno-ipa-cp -ffast-math +# this gets you about 20% better execution speed on 32bit arm/mips +CFLAGS += -fno-common -fno-stack-protector -fno-guess-branch-probability -fno-caller-saves -fno-tree-loop-if-convert -fno-regmove +endif +#OBJS += align.o + # default settings ifeq "$(ARCH)" "arm" use_cyclone ?= 1 @@ -47,10 +67,12 @@ asm_misc ?= 1 asm_cdmemory ?= 1 asm_mix ?= 1 -else # if not arm +asm_32xdraw ?= 1 +asm_32xmemory ?= 1 +else use_fame ?= 1 use_cz80 ?= 1 -ifneq (,$(findstring 86,$(ARCH))) +ifneq (,$(filter x86% i386% mips% aarch% riscv% powerpc% ppc%, $(ARCH))) use_sh2drc ?= 1 endif endif @@ -91,6 +113,7 @@ USE_FRONTEND = 1 endif ifeq "$(PLATFORM)" "generic" +CFLAGS += -DSDL_OVERLAY_2X OBJS += platform/linux/emu.o platform/linux/blit.o # FIXME OBJS += platform/common/plat_sdl.o OBJS += platform/libpicofe/plat_sdl.o platform/libpicofe/in_sdl.o @@ -124,6 +147,8 @@ OBJS += platform/gp2x/warm.o USE_FRONTEND = 1 PLATFORM_MP3 = 1 +PLATFORM_ZLIB = 1 +HAVE_ARMv6 = 0 endif ifeq "$(PLATFORM)" "libretro" OBJS += platform/libretro/libretro.o @@ -135,6 +160,7 @@ OBJS += platform/libretro/libretro-common/streams/file_stream_transforms.o OBJS += platform/libretro/libretro-common/vfs/vfs_implementation.o endif +PLATFORM_ZLIB = 1 endif ifeq "$(USE_FRONTEND)" "1" @@ -169,15 +195,17 @@ endif # USE_FRONTEND -OBJS += platform/common/mp3.o +OBJS += platform/common/mp3.o platform/common/mp3_sync.o ifeq "$(PLATFORM_MP3)" "1" +platform/common/mp3_helix.o: CFLAGS += -Iplatform/libpicofe +OBJS += platform/common/mp3_helix.o else ifeq "$(HAVE_LIBAVCODEC)" "1" OBJS += platform/common/mp3_libavcodec.o else OBJS += platform/common/mp3_dummy.o endif -ifeq "$(PLATFORM)" "libretro" +ifeq "$(PLATFORM_ZLIB)" "1" # zlib OBJS += zlib/gzio.o zlib/inffast.o zlib/inflate.o zlib/inftrees.o zlib/trees.o \ zlib/deflate.o zlib/crc32.o zlib/adler32.o zlib/zutil.o zlib/compress.o zlib/uncompr.o @@ -201,7 +229,7 @@ target_: $(TARGET) clean: - $(RM) $(TARGET) $(OBJS) + $(RM) $(TARGET) $(OBJS) pico/pico_int_offs.h $(RM) -r .opk_data $(TARGET): $(OBJS) @@ -213,10 +241,10 @@ endif pprof: platform/linux/pprof.c - $(CC) -O2 -ggdb -DPPROF -DPPROF_TOOL -I../../ -I. $^ -o $@ + $(CC) $(CFLAGS) -O2 -ggdb -DPPROF -DPPROF_TOOL -I../../ -I. $^ -o $@ $(LDFLAGS) $(LDLIBS) -tools/textfilter: tools/textfilter.c - make -C tools/ textfilter +pico/pico_int_offs.h: tools/mkoffsets.sh + make -C tools/ XCC="$(CC)" XCFLAGS="$(CFLAGS)" XPLATFORM="$(platform)" %.o: %.c $(CC) -c $(OBJOUT)$@ $< $(CFLAGS) @@ -236,6 +264,14 @@ pico/cd/pcm.o: CFLAGS += -fno-strict-aliasing pico/cd/LC89510.o: CFLAGS += -fno-strict-aliasing pico/cd/gfx_cd.o: CFLAGS += -fno-strict-aliasing +ifeq (1,$(use_sh2drc)) +ifneq (,$(findstring -flto,$(CFLAGS))) +# if using the DRC, memory and sh2soc directly use the DRC register for SH2 SR +# to avoid saving and reloading it. However, this collides with the use of LTO. +pico/32x/memory.o: CFLAGS += -fno-lto +pico/32x/sh2soc.o: CFLAGS += -fno-lto +endif +endif # fame needs ~2GB of RAM to compile on gcc 4.8 # on x86, this is reduced by ~300MB when debug info is off (but not on ARM) @@ -252,10 +288,13 @@ pico/carthw_cfg.c: pico/carthw.cfg tools/make_carthw_c $< $@ +# preprocessed asm files most probably include the offsets file +$(filter %.S,$(SRCS_COMMON)): pico/pico_int_offs.h + # random deps pico/carthw/svp/compiler.o : cpu/drc/emit_arm.c -cpu/sh2/compiler.o : cpu/drc/emit_arm.c -cpu/sh2/compiler.o : cpu/drc/emit_x86.c +cpu/sh2/compiler.o : cpu/drc/emit_arm.c cpu/drc/emit_arm64.c cpu/drc/emit_ppc.c +cpu/sh2/compiler.o : cpu/drc/emit_x86.c cpu/drc/emit_mips.c cpu/drc/emit_riscv.c cpu/sh2/mame/sh2pico.o : cpu/sh2/mame/sh2.c pico/pico.o pico/cd/mcd.o pico/32x/32x.o : pico/pico_cmn.c pico/pico_int.h pico/memory.o pico/cd/memory.o pico/32x/memory.o : pico/pico_int.h pico/memory.h
View file
libretro-picodrive-0~git20200112.tar.xz/Makefile.libretro -> libretro-picodrive-0~git20200716.tar.xz/Makefile.libretro
Changed
@@ -15,10 +15,6 @@ platform = win else ifneq ($(findstring Darwin,$(shell uname -a)),) platform = osx - arch = intel - ifeq ($(shell uname -p),powerpc) - arch = ppc - endif else ifneq ($(findstring win,$(shell uname -a)),) platform = win endif @@ -41,18 +37,11 @@ STATIC_LINKING:= 0 TARGET_NAME := picodrive LIBM := -lm -GIT_VERSION := " $(shell git rev-parse --short HEAD || echo unknown)" -ifneq ($(GIT_VERSION)," unknown") +GIT_VERSION ?= $(shell git rev-parse --short HEAD || echo unknown) +ifneq ($(GIT_VERSION),"unknown") CFLAGS += -DGIT_VERSION=\"$(GIT_VERSION)\" endif -asm_memory = 0 -asm_render = 0 -asm_ym2612 = 0 -asm_misc = 0 -asm_cdmemory = 0 -asm_mix = 0 - fpic := ifeq ($(STATIC_LINKING),1) @@ -67,7 +56,6 @@ SHARED := -shared DONT_COMPILE_IN_ZLIB = 1 CFLAGS += -DFAMEC_NO_GOTOS - use_sh2drc = 1 ifneq ($(findstring SunOS,$(shell uname -a)),) CC=gcc endif @@ -81,7 +69,6 @@ LIBM := DONT_COMPILE_IN_ZLIB = 1 CFLAGS += -DFAMEC_NO_GOTOS - use_sh2drc = 1 # OS X else ifeq ($(platform), osx) @@ -90,14 +77,9 @@ SHARED := -dynamiclib fpic := -fPIC APPLE := 1 - arch = intel ifeq ($(shell uname -p),powerpc) - arch = ppc - endif - ifeq ($(arch),ppc) + CFLAGS += -DHAVE_NO_LANGEXTRA CFLAGS += -DBLARGG_BIG_ENDIAN=1 -D__ppc__ -DFAMEC_NO_GOTOS - else - use_sh2drc = 1 endif OSXVER = `sw_vers -productVersion | cut -d. -f 2` OSX_LT_MAVERICKS = `(( $(OSXVER) <= 9)) && echo "YES"` @@ -119,15 +101,8 @@ CXX += -miphoneos-version-min=8.0 CC_AS += -miphoneos-version-min=8.0 CFLAGS += -miphoneos-version-min=8.0 - ARCH := arm STATIC_LINKING = 1 - use_cyclone = 0 - use_fame = 1 - use_drz80 = 0 - use_cz80 = 1 - use_sh2drc = 0 - use_svpdrc = 0 # iOS else ifneq (,$(findstring ios,$(platform))) @@ -141,14 +116,15 @@ ifeq ($(platform),ios-arm64) CC = clang -arch arm64 -isysroot $(IOSSDK) CXX = clang++ -arch arm64 -isysroot $(IOSSDK) - CFLAGS += -marm -DARM -D__aarch64__=1 + CFLAGS += -marm -DARM -D__aarch64__=1 else CC = clang -arch armv7 -isysroot $(IOSSDK) CXX = clang++ -arch armv7 -isysroot $(IOSSDK) - CFLAGS += -mcpu=cortex-a8 -mtune=cortex-a8 -mfpu=neon -marm + CC_AS = perl ./tools/gas-preprocessor.pl $(CC) + CFLAGS += -mcpu=cortex-a8 -mtune=cortex-a8 -mfpu=neon -marm ASFLAGS += -mcpu=cortex-a8 -mtune=cortex-a8 -mfpu=neon + NO_ARM_ASM = 1 endif - CC_AS = perl ./tools/gas-preprocessor.pl $(CC) CFLAGS += -DIOS ifeq ($(platform),$(filter $(platform),ios9 ios-arm64)) @@ -162,29 +138,9 @@ CC_AS += -miphoneos-version-min=5.0 CFLAGS += -miphoneos-version-min=5.0 endif - ARCH := arm - - use_cyclone = 0 - use_fame = 1 - use_drz80 = 0 - use_cz80 = 1 - ifeq ($(platform),ios-arm64) - use_sh2drc = 0 - use_svpdrc = 0 - else - use_sh2drc = 1 - use_svpdrc = 1 - endif # tvOS else ifeq ($(platform), tvos-arm64) - ARCH := arm - use_cyclone = 0 - use_fame = 1 - use_drz80 = 0 - use_cz80 = 1 - use_sh2drc = 0 - use_svpdrc = 0 TARGET := $(TARGET_NAME)_libretro_tvos.dylib SHARED := -dynamiclib fpic := -fPIC @@ -193,6 +149,9 @@ IOSSDK := $(shell xcodebuild -version -sdk appletvos Path) endif CC_AS = perl ./tools/gas-preprocessor.pl $(CC) + CC = clang -arch arm64 -isysroot $(IOSSDK) + CXX = clang++ -arch arm64 -isysroot $(IOSSDK) + CFLAGS += -marm -DARM -D__aarch64__=1 CFLAGS += -DIOS # PS3 @@ -205,20 +164,9 @@ NO_MMAP = 1 DONT_COMPILE_IN_ZLIB = 1 - asm_memory = 0 - asm_render = 0 - asm_ym2612 = 0 - asm_misc = 0 - asm_cdpico = 0 - asm_cdmemory = 0 - asm_mix = 0 - use_cyclone = 0 - use_fame = 1 - use_drz80 = 0 - use_cz80 = 1 - # sncps3 else ifeq ($(platform), sncps3) + ARCH = powerpc TARGET := $(TARGET_NAME)_libretro_ps3.a CC = $(CELL_SDK)/host-win32/sn/bin/ps3ppusnc.exe AR = $(CELL_SDK)/host-win32/sn/bin/ps3snarl.exe @@ -227,18 +175,6 @@ NO_MMAP = 1 DONT_COMPILE_IN_ZLIB = 1 - asm_memory = 0 - asm_render = 0 - asm_ym2612 = 0 - asm_misc = 0 - asm_cdpico = 0 - asm_cdmemory = 0 - asm_mix = 0 - use_cyclone = 0 - use_fame = 1 - use_drz80 = 0 - use_cz80 = 1 - # Lightweight PS3 Homebrew SDK else ifeq ($(platform), psl1ght) TARGET := $(TARGET_NAME)_libretro_$(platform).a @@ -249,130 +185,60 @@ NO_MMAP = 1 DONT_COMPILE_IN_ZLIB = 1 - asm_memory = 0 - asm_render = 0 - asm_ym2612 = 0 - asm_misc = 0 - asm_cdpico = 0 - asm_cdmemory = 0 - asm_mix = 0 - use_cyclone = 0 - use_fame = 1 - use_drz80 = 0 - use_cz80 = 1 - # PSP else ifeq ($(platform), psp1) - TARGET := $(TARGET_NAME)_libretro_$(platform).a - CC = psp-gcc$(EXE_EXT)
View file
libretro-picodrive-0~git20200716.tar.xz/README.md
Added
@@ -0,0 +1,70 @@ +This is my foray into dynamic recompilation using PicoDrive, a +Megadrive / Genesis / Sega CD / Mega CD / 32X / SMS emulator. + +I added support for MIPS (mips32r1), ARM64 (aarch64) and RISC-V (RV64IM) to the +SH2 recompiler, as well as spent much effort to optimize the DRC-generated code. +I also optimized SH2 memory access inside the emulator, and did some work on +M68K/SH2 CPU synchronization to fix some problems and speed up the emulator. + +It got a bit out of hand. I ended up doing fixes and optimizations all over the +place, mainly for 32X and CD, 32X graphics handling, and probably some more, +see the commit history. As a result, 32X emulation speed has improved a lot. + +### compiling + +I mainly worked with standalone PicoDrive versions as created by configure/make. +A list of platforms for which this is possible can be obtained with + +> configure --help + +If you want to build an executable for a unixoid platform not listed in the +platform list, just use + +> configure --platform=generic + +If DRC is available for the platform, it should be enabled automatically. + +For other platforms using a cross-compiling toolchain I used this, +assuming $TC points to the appropriate cross compile toolchain directory: + +platform|toolchain|configure command +--------|---------|----------------- +gp2x,wiz,caanoo|open2x|CROSS_COMPILE=arm-open2x-linux- CFLAGS="-I$TC/gcc-4.1.1-glibc-2.3.6/arm-open2x-linux/include" LDFLAGS="--sysroot $TC/gcc-4.1.1-glibc-2.3.6/arm-open2x-linux -L$TC/gcc-4.1.1-glibc-2.3.6/arm-open2x-linux/lib" ./configure --platform=gp2x +gp2x,wiz,caanoo|open2x with ubuntu arm gcc 4.7|CROSS_COMPILE=arm-linux-gnueabi- CFLAGS="-I$TC/gcc-4.1.1-glibc-2.3.6/arm-open2x-linux/include" LDFLAGS="-B$TC/gcc-4.1.1-glibc-2.3.6/lib/gcc/arm-open2x-linux/4.1.1 -B$TC/gcc-4.1.1-glibc-2.3.6/arm-open2x-linux/lib -L$TC/gcc-4.1.1-glibc-2.3.6/arm-open2x-linux/lib" ./configure --platform=gp2x +opendingux|opendingux|CROSS_COMPILE=mipsel-linux- CFLAGS="-I$TC/usr/include -I$TC/usr/include/SDL" LDFLAGS="--sysroot $TC -L$TC/lib" ./configure --platform=opendingux +opendingux|opendingux with ubuntu mips gcc 5.4|CROSS_COMPILE=mipsel-linux-gnu- CFLAGS="-I$TC/usr/include -I$TC/usr/include/SDL" LDFLAGS="-B$TC/usr/lib -B$TC/lib -Wl,-rpath-link=$TC/usr/lib -Wl,-rpath-link=$TC/lib" ./configure --platform=opendingux +gcw0|gcw0|CROSS_COMPILE=mipsel-gcw0-linux-uclibc- CFLAGS="-I$TC/usr/mipsel-gcw0-linux-uclibc/sysroot/usr/include -I$TC/usr/mipsel-gcw0-linux-uclibc/sysroot/usr/include/SDL" LDFLAGS="--sysroot $TC/usr/mipsel-gcw0-linux-uclibc/sysroot" ./configure --platform=gcw0 + +For gp2x, wiz, and caanoo you may need to compile libpng first. + +After configure, compile with + +> make opk # for opendingux and gcw0 +> +> make # for anything else + +### helix MP3 decoder + +For 32 bit ARM platforms, there is the possibility to compile the helix MP3 +decoder into a shared library to be able to use MP3 audio files with CD games. +The helix source files aren't supplied because of licensing issues. However, if +you have obtained the sources, put them into the platform/common/helix +directory, set CROSS to your cross compiler prefix (e.g. arm-linux-gnueabi-) +and LIBGCC to your cross compiler's libgcc.a +(e.g. /usr/lib/gcc-cross/arm-linux-gnueabi/4.7/libgcc.a), and compile with + +> make -C platform/common/helix CROSS=$CROSS LIBGCC=$LIBGCC + +Copy the resulting ${CROSS}helix_mp3.so as libhelix.so to the directory where +the PicoDrive binary is. + +### installing + +You need to install the resulting binary onto your device manually. +For opendingux and gcw0, copy the opk to your SD card. +For gp2x, wiz and caanoo, the easiest way is to unpack +[PicoDrive_191.zip](http://notaz.gp2x.de/releases/PicoDrive/PicoDrive_191.zip) +on your SD card and replace the PicoDrive binary. + +Send bug reports, fixes etc to <derkub@gmail.com> +Kai-Uwe Bloem
View file
libretro-picodrive-0~git20200112.tar.xz/configure -> libretro-picodrive-0~git20200716.tar.xz/configure
Changed
@@ -22,6 +22,13 @@ $c >> config.log 2>&1 } +check_option() +{ + echo 'void test(void) { }' >$TMPC + compile_object $1 || return 1 + return 0 +} + check_define() { $CC -E -dD $CFLAGS pico/arm_features.h | grep -q $1 || return 1 @@ -31,17 +38,18 @@ # setting options to "yes" or "no" will make that choice default, # "" means "autodetect". -platform_list="generic pandora gp2x opendingux rpi1 rpi2" +platform_list="generic pandora gp2x wiz caanoo opendingux gcw0 rpi1 rpi2" platform="generic" sound_driver_list="oss alsa sdl" sound_drivers="" have_armv5="" have_armv6="" have_armv7="" +have_arm_oabi="" have_arm_neon="" have_libavcodec="" need_sdl="no" -need_xlib="no" +need_zlib="no" # these are for known platforms optimize_cortexa8="no" optimize_cortexa7="no" @@ -54,7 +62,7 @@ CXX="${CXX-${CROSS_COMPILE}g++}" AS="${AS-${CROSS_COMPILE}as}" STRIP="${STRIP-${CROSS_COMPILE}strip}" -test -n "$SDL_CONFIG" || SDL_CONFIG="`$CC --print-sysroot 2> /dev/null || true`/usr/bin/sdl-config" +test -n "$SDL_CONFIG" || SDL_CONFIG="`$CC $CFLAGS $LDFLAGS --print-sysroot 2> /dev/null || true`/usr/bin/sdl-config" MAIN_LDLIBS="$LDLIBS -lm" config_mak="config.mak" @@ -78,23 +86,27 @@ ;; generic) ;; - opendingux) + opendingux | gcw0) sound_drivers="sdl" + # both are really an opendingux + platform="opendingux" ;; pandora) sound_drivers="oss alsa" optimize_cortexa8="yes" have_arm_neon="yes" ;; - gp2x) + gp2x | wiz | caanoo) sound_drivers="oss" optimize_arm920="yes" + # compile for OABI if toolchain provides it (faster code on caanoo) + have_arm_oabi="yes" + # always use static linking, since caanoo doesn't have OABI libs. Moreover, + # dynamic linking slows Wiz 1-10%, and libm on F100 isn't compatible + LDFLAGS="$LDFLAGS -static" + # unified binary for all of them CFLAGS="$CFLAGS -D__GP2X__" - if [ "$CROSS_COMPILE" = "arm-linux-" ]; then - # still using static, dynamic linking slows Wiz 1-10% - # also libm on F100 is not compatible - MAIN_LDLIBS="$MAIN_LDLIBS -static" - fi + platform="gp2x" ;; *) fail "unsupported platform: $platform" @@ -147,18 +159,11 @@ # fi #fi -# basic compiler test -cat > $TMPC <<EOF -int main(void) { return 0; } -EOF -if ! compile_binary; then - fail "compiler test failed, please check config.log" -fi - if [ -z "$ARCH" ]; then ARCH=`$CC -dumpmachine | awk -F '-' '{print $1}'` fi +# CPU/ABI stuff first, else compile test may fail case "$ARCH" in arm*) # ARM stuff @@ -206,26 +211,40 @@ have_armv5=`check_define HAVE_ARMV5 && echo yes` || true fi + # must disable thumb as recompiler can't handle it + if check_define __thumb__; then + CFLAGS="$CFLAGS -marm" + fi + # OABI/EABI selection + if [ "$have_arm_oabi" = "yes" ] && check_option -mabi=apcs-gnu; then + echo "$CFLAGS" | grep -q -- '-mabi=' || CFLAGS="$CFLAGS -mabi=apcs-gnu" + echo "$CFLAGS" | grep -q -- '-m\(no-\)*thumb-interwork' || CFLAGS="$CFLAGS -mno-thumb-interwork" + echo "$ASFLAGS" | grep -q -- '-mabi=' || ASFLAGS="$ASFLAGS -mabi=apcs-gnu" + fi + # automatically set mfpu and mfloat-abi if they are not set if [ "$have_arm_neon" = "yes" ]; then fpu="neon" + abi="hard" elif [ "$have_armv6" = "yes" ]; then fpu="vfp" + abi="softfp" + elif check_option -mfpu=fpa; then + fpu="fpa" # compatibility option for arm-linux-gnueabi-gcc on ubuntu + abi="soft" fi if [ "x$fpu" != "x" ]; then echo "$CFLAGS" | grep -q -- '-mfpu=' || CFLAGS="$CFLAGS -mfpu=$fpu" echo "$ASFLAGS" | grep -q -- '-mfpu=' || ASFLAGS="$ASFLAGS -mfpu=$fpu" - floatabi_set_by_gcc=`$CC -v 2>&1 | grep -q -- --with-float= && echo yes` || true - if [ "$floatabi_set_by_gcc" != "yes" ]; then - echo "$CFLAGS" | grep -q -- '-mfloat-abi=' || CFLAGS="$CFLAGS -mfloat-abi=softfp" - echo "$ASFLAGS" | grep -q -- '-mfloat-abi=' || ASFLAGS="$ASFLAGS -mfloat-abi=softfp" - fi + echo "$CFLAGS" | grep -q -- '-mfloat-abi=' || CFLAGS="$CFLAGS -mfloat-abi=$abi" + echo "$ASFLAGS" | grep -q -- '-mfloat-abi=' || ASFLAGS="$ASFLAGS -mfloat-abi=$abi" fi - # must disable thumb as recompiler can't handle it - if check_define __thumb__; then - CFLAGS="$CFLAGS -marm" - fi + # add -ldl for helix support + case "$MAIN_LDLIBS" in + *"-ldl"*) ;; + *) MAIN_LDLIBS="-ldl $MAIN_LDLIBS" ;; + esac # warn about common mistakes if [ "$platform" != "gp2x" -a "$have_armv5" != "yes" ]; then @@ -251,6 +270,14 @@ ;; esac +# basic compiler test +cat > $TMPC <<EOF +int main(void) { return 0; } +EOF +if ! compile_binary; then + fail "compiler test failed, please check config.log" +fi + # header/library presence tests check_zlib() { @@ -308,8 +335,7 @@ compile_object "$@" } -MAIN_LDLIBS="$MAIN_LDLIBS -lz" -check_zlib -lz || fail "please install zlib (libz-dev)" +check_zlib -lz &&MAIN_LDLIBS="$MAIN_LDLIBS -lz" || need_zlib="yes" MAIN_LDLIBS="-lpng $MAIN_LDLIBS" check_libpng || fail "please install libpng (libpng-dev)" @@ -352,10 +378,7 @@ check_sdl `$SDL_CONFIG --libs` || fail "please install libsdl (libsdl1.2-dev)" fi -cat > $TMPC <<EOF -void test(void *f, void *d) { fread(d, 1, 1, f); } -EOF -if compile_object -Wno-unused-result; then +if check_option -Wno-unused_result; then CFLAGS="$CFLAGS -Wno-unused-result" fi @@ -395,15 +418,18 @@ if [ "$have_libavcodec" = "yes" ]; then echo "HAVE_LIBAVCODEC = 1" >> $config_mak fi +if [ "$need_zlib" = "yes" ]; then + echo "PLATFORM_ZLIB = 1" >> $config_mak +fi # GP2X toolchains are too old for UAL asm, # so add this here to not litter main Makefile -if [ "$platform" = "g1p2x" ]; then - echo >> $config_mak - echo "%.o: %.S" >> $config_mak - echo " $(CC) $(CFLAGS) -E -c $^ -o /tmp/$(notdir $@).s" >> $config_mak
View file
libretro-picodrive-0~git20200716.tar.xz/cpu/DrZ80/drz80.S
Added
@@ -0,0 +1,8098 @@ +;@ Reesy's Z80 Emulator Version 0.001 + +;@ (c) Copyright 2004 Reesy, All rights reserved +;@ DrZ80 is free for non-commercial use. + +;@ For commercial use, separate licencing terms must be obtained. + +#include "../../pico/arm_features.h" + + .data + .align 4 + + .global DrZ80Run + .global DrZ80Ver + + .equiv INTERRUPT_MODE, 0 ;@0 = Use internal int handler, 1 = Use Mames int handler + .equiv FAST_Z80SP, 0 ;@0 = Use mem functions for stack pointer, 1 = Use direct mem pointer + .equiv UPDATE_CONTEXT, 0 + .equiv DRZ80_XMAP, 1 + .equiv DRZ80_XMAP_MORE_INLINE, 1 + +.if DRZ80_XMAP + .equ Z80_MEM_SHIFT, 13 +.endif + +.if INTERRUPT_MODE + .extern Interrupt +.endif + +DrZ80Ver: .long 0x0001 + +;@ --------------------------- Defines ---------------------------- +;@ Make sure that regs/pointers for z80pc to z80sp match up! + + z80_icount .req r3 + opcodes .req r4 + cpucontext .req r5 + z80pc .req r6 + z80a .req r7 + z80f .req r8 + z80bc .req r9 + z80de .req r10 + z80hl .req r11 + z80sp .req r12 + z80xx .req lr + + .equ z80pc_pointer, 0 ;@ 0 + .equ z80a_pointer, z80pc_pointer+4 ;@ 4 + .equ z80f_pointer, z80a_pointer+4 ;@ 8 + .equ z80bc_pointer, z80f_pointer+4 ;@ + .equ z80de_pointer, z80bc_pointer+4 + .equ z80hl_pointer, z80de_pointer+4 + .equ z80sp_pointer, z80hl_pointer+4 + .equ z80pc_base, z80sp_pointer+4 + .equ z80sp_base, z80pc_base+4 + .equ z80ix, z80sp_base+4 + .equ z80iy, z80ix+4 + .equ z80i, z80iy+4 + .equ z80a2, z80i+4 + .equ z80f2, z80a2+4 + .equ z80bc2, z80f2+4 + .equ z80de2, z80bc2+4 + .equ z80hl2, z80de2+4 + .equ cycles_pointer, z80hl2+4 + .equ previouspc, cycles_pointer+4 + .equ z80irq, previouspc+4 + .equ z80if, z80irq+1 + .equ z80im, z80if+1 + .equ z80r, z80im+1 + .equ z80irqvector, z80r+1 + .equ z80irqcallback, z80irqvector+4 + .equ z80_write8, z80irqcallback+4 + .equ z80_write16, z80_write8+4 + .equ z80_in, z80_write16+4 + .equ z80_out, z80_in+4 + .equ z80_read8, z80_out+4 + .equ z80_read16, z80_read8+4 + .equ z80_rebaseSP, z80_read16+4 + .equ z80_rebasePC, z80_rebaseSP+4 + + .equ VFlag, 0 + .equ CFlag, 1 + .equ ZFlag, 2 + .equ SFlag, 3 + .equ HFlag, 4 + .equ NFlag, 5 + .equ Flag3, 6 + .equ Flag5, 7 + + .equ Z80_CFlag, 0 + .equ Z80_NFlag, 1 + .equ Z80_VFlag, 2 + .equ Z80_Flag3, 3 + .equ Z80_HFlag, 4 + .equ Z80_Flag5, 5 + .equ Z80_ZFlag, 6 + .equ Z80_SFlag, 7 + + .equ Z80_IF1, 1<<0 + .equ Z80_IF2, 1<<1 + .equ Z80_HALT, 1<<2 + .equ Z80_NMI, 1<<3 + +;@--------------------------------------- + +.text + PIC_LDR_INIT() + +.if DRZ80_XMAP + +z80_xmap_read8: @ addr + ldr r1,[cpucontext,#z80_read8] + mov r2,r0,lsr #Z80_MEM_SHIFT + ldr r1,[r1,r2,lsl #2] + movs r1,r1,lsl #1 + ldrccb r0,[r1,r0] + bxcc lr + +z80_xmap_read8_handler: @ addr, func + str z80_icount,[cpucontext,#cycles_pointer] + stmfd sp!,{r12,lr} + mov lr,pc + bx r1 + ldr z80_icount,[cpucontext,#cycles_pointer] + ldmfd sp!,{r12,pc} + +z80_xmap_write8: @ data, addr + ldr r2,[cpucontext,#z80_write8] + add r2,r2,r1,lsr #Z80_MEM_SHIFT-2 + bic r2,r2,#3 + ldr r2,[r2] + movs r2,r2,lsl #1 + strccb r0,[r2,r1] + bxcc lr + +z80_xmap_write8_handler: @ data, addr, func + str z80_icount,[cpucontext,#cycles_pointer] + mov r3,r0 + mov r0,r1 + mov r1,r3 + stmfd sp!,{r12,lr} + mov lr,pc + bx r2 + ldr z80_icount,[cpucontext,#cycles_pointer] + ldmfd sp!,{r12,pc} + +z80_xmap_read16: @ addr + @ check if we cross bank boundary + add r1,r0,#1 + eor r1,r1,r0 + tst r1,#1<<Z80_MEM_SHIFT + bne 0f + + ldr r1,[cpucontext,#z80_read8] + mov r2,r0,lsr #Z80_MEM_SHIFT + ldr r1,[r1,r2,lsl #2] + movs r1,r1,lsl #1 + bcs 0f + ldrb r0,[r1,r0]! + ldrb r1,[r1,#1] + orr r0,r0,r1,lsl #8 + bx lr + +0: + @ z80_xmap_read8 will save r3 and r12 for us + stmfd sp!,{r8,r9,lr} + mov r8,r0 + bl z80_xmap_read8 + mov r9,r0 + add r0,r8,#1 + bl z80_xmap_read8 + orr r0,r9,r0,lsl #8 + ldmfd sp!,{r8,r9,pc} + +z80_xmap_write16: @ data, addr + add r2,r1,#1 + eor r2,r2,r1 + tst r2,#1<<Z80_MEM_SHIFT + bne 0f + + ldr r2,[cpucontext,#z80_write8] + add r2,r2,r1,lsr #Z80_MEM_SHIFT-2 + bic r2,r2,#3 + ldr r2,[r2] + movs r2,r2,lsl #1 + bcs 0f + strb r0,[r2,r1]! + mov r0,r0,lsr #8 + strb r0,[r2,#1] + bx lr + +0: + stmfd sp!,{r8,r9,lr} + mov r8,r0 + mov r9,r1 + bl z80_xmap_write8 + mov r0,r8,lsr #8 + add r1,r9,#1 + bl z80_xmap_write8
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/cz80/cz80.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/cz80/cz80.c
Changed
@@ -14,6 +14,7 @@ #include "cz80.h" #if PICODRIVE_HACKS +#include <pico/pico_int.h> #include <pico/memory.h> #endif @@ -277,7 +278,8 @@ CPU->ICount -= CPU->ExtraCycles; CPU->ExtraCycles = 0; } - goto Cz80_Exec; + if (!CPU->HaltState) + goto Cz80_Exec; } } else CPU->ICount = 0; @@ -287,6 +289,8 @@ #if CZ80_ENCRYPTED_ROM CPU->OPBase = OPBase; #endif + if (CPU->HaltState) + CPU->ICount = 0; cycles -= CPU->ICount; #if !CZ80_EMULATE_R_EXACTLY zR = (zR + (cycles >> 2)) & 0x7f;
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/cz80/cz80_op.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/cz80/cz80_op.c
Changed
@@ -687,13 +687,13 @@ OP(0x76): // HALT OP_HALT: CPU->HaltState = 1; - CPU->ICount = 0; goto Cz80_Check_Interrupt; OP(0xf3): // DI OP_DI: zIFF = 0; - RET(4) + USE_CYCLES(4) + goto Cz80_Exec_nocheck; OP(0xfb): // EI OP_EI:
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/drc/cmn.h -> libretro-picodrive-0~git20200716.tar.xz/cpu/drc/cmn.h
Changed
@@ -1,16 +1,44 @@ -#ifndef UTYPES_DEFINED -typedef unsigned char u8; -typedef signed char s8; -typedef unsigned short u16; -typedef signed short s16; -typedef unsigned int u32; -typedef signed int s32; -#endif -#define DRC_TCACHE_SIZE (2*1024*1024) +#define DRC_TCACHE_SIZE (4*1024*1024) extern u8 *tcache; void drc_cmn_init(void); void drc_cmn_cleanup(void); +#define BITMASK1(v0) (1 << (v0)) +#define BITMASK2(v0,v1) ((1 << (v0)) | (1 << (v1))) +#define BITMASK3(v0,v1,v2) (BITMASK2(v0,v1) | (1 << (v2))) +#define BITMASK4(v0,v1,v2,v3) (BITMASK3(v0,v1,v2) | (1 << (v3))) +#define BITMASK5(v0,v1,v2,v3,v4) (BITMASK4(v0,v1,v2,v3) | (1 << (v4))) +#define BITMASK6(v0,v1,v2,v3,v4,v5) (BITMASK5(v0,v1,v2,v3,v4) | (1 << (v5))) +#define BITRANGE(v0,v1) (BITMASK1(v1+1)-BITMASK1(v0)) // set with v0..v1 + +// binary search approach, since we don't have CLZ on ARM920T +#define FOR_ALL_BITS_SET_DO(mask, bit, code) { \ + u32 __mask = mask; \ + for (bit = 0; bit < 32 && mask; bit++, __mask >>= 1) { \ + if (!(__mask & 0xffff)) \ + bit += 16,__mask >>= 16; \ + if (!(__mask & 0xff)) \ + bit += 8, __mask >>= 8; \ + if (!(__mask & 0xf)) \ + bit += 4, __mask >>= 4; \ + if (!(__mask & 0x3)) \ + bit += 2, __mask >>= 2; \ + if (!(__mask & 0x1)) \ + bit += 1, __mask >>= 1; \ + if (__mask & 0x1) { \ + code; \ + } \ + } \ +} + +// inspired by https://graphics.stanford.edu/~seander/bithacks.html +static inline int count_bits(unsigned val) +{ + val = val - ((val >> 1) & 0x55555555); + val = (val & 0x33333333) + ((val >> 2) & 0x33333333); + return (((val + (val >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24; +} +
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/drc/emit_arm.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/drc/emit_arm.c
Changed
@@ -1,34 +1,192 @@ /* * Basic macros to emit ARM instructions and some utils * Copyright (C) 2008,2009,2010 notaz + * Copyright (C) 2019 kub * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. */ -#define CONTEXT_REG 11 -#define RET_REG 0 +#define HOST_REGS 16 + +// OABI/EABI: params: r0-r3, return: r0-r1, temp: r12,r14, saved: r4-r8,r10,r11 +// SP,PC: r13,r15 must not be used. saved: r9 (for platform use, e.g. on ios) +#define RET_REG 0 +#define PARAM_REGS { 0, 1, 2, 3 } +#ifndef __MACH__ +#define PRESERVED_REGS { 4, 5, 6, 7, 8, 9, 10, 11 } +#else +#define PRESERVED_REGS { 4, 5, 6, 7, 8, 10, 11 } // no r9.. +#endif +#define TEMPORARY_REGS { 12, 14 } + +#define CONTEXT_REG 11 +#define STATIC_SH2_REGS { SHR_SR,10 , SHR_R(0),8 , SHR_R(1),9 } // XXX: tcache_ptr type for SVP and SH2 compilers differs.. #define EMIT_PTR(ptr, x) \ do { \ *(u32 *)ptr = x; \ ptr = (void *)((u8 *)ptr + sizeof(u32)); \ - COUNT_OP; \ } while (0) -#define EMIT(x) EMIT_PTR(tcache_ptr, x) +// ARM special registers and peephole optimization flags +#define SP 13 // stack pointer +#define LR 14 // link (return address) +#define PC 15 // program counter +#define SR 16 // CPSR, status register +#define MEM 17 // memory access (src=LDR, dst=STR) +#define CYC1 20 // 1 cycle interlock (LDR, reg-cntrld shift) +#define CYC2 (CYC1+1)// 2+ cycles interlock (LDR[BH], MUL/MLA etc) +#define NO 32 // token for "no register" + +// bitmask builders +#define M1(x) (u32)(1ULL<<(x)) // u32 to have NO evaluate to 0 +#define M2(x,y) (M1(x)|M1(y)) +#define M3(x,y,z) (M2(x,y)|M1(z)) +#define M4(x,y,z,a) (M3(x,y,z)|M1(a)) +#define M5(x,y,z,a,b) (M4(x,y,z,a)|M1(b)) +#define M6(x,y,z,a,b,c) (M5(x,y,z,a,b)|M1(c)) +#define M10(a,b,c,d,e,f,g,h,i,j) (M5(a,b,c,d,e)|M5(f,g,h,i,j)) + +// avoid a warning with clang +static inline uintptr_t pabs(intptr_t v) { return labs(v); } + +// sys_cacheflush always flushes whole pages, and it's rather expensive on ARMs +// hold a list of pending cache updates and merge requests to reduce cacheflush +static struct { void *base, *end; } pageflush[4]; +static unsigned pagesize = 4096; + +static void emith_update_cache(void) +{ + int i; + + for (i = 0; i < 4 && pageflush[i].base; i++) { + cache_flush_d_inval_i(pageflush[i].base, pageflush[i].end + pagesize-1); + pageflush[i].base = NULL; + } +} + +static inline void emith_update_add(void *base, void *end) +{ + void *p_base = (void *)((uintptr_t)(base) & ~(pagesize-1)); + void *p_end = (void *)((uintptr_t)(end ) & ~(pagesize-1)); + int i; + + for (i = 0; i < 4 && pageflush[i].base; i++) { + if (p_base <= pageflush[i].end+pagesize && p_end >= pageflush[i].end) { + if (p_base < pageflush[i].base) pageflush[i].base = p_base; + pageflush[i].end = p_end; + return; + } + if (p_base <= pageflush[i].base && p_end >= pageflush[i].base-pagesize) { + if (p_end > pageflush[i].end) pageflush[i].end = p_end; + pageflush[i].base = p_base; + return; + } + } + if (i == 4) { + /* list full and not mergeable -> flush list */ + emith_update_cache(); + i = 0; + } + pageflush[i].base = p_base, pageflush[i].end = p_end; +} + +// peephole optimizer. ATM only tries to reduce interlock +#define EMIT_CACHE_SIZE 6 +struct emit_op { + u32 op; + u32 src, dst; +}; + +// peephole cache, last commited insn + cache + next insn = size+2 +static struct emit_op emit_cache[EMIT_CACHE_SIZE+2]; +static int emit_index; +#define emith_insn_ptr() (u8 *)((u32 *)tcache_ptr-emit_index) + +static inline void emith_pool_adjust(int tcache_offs, int move_offs); + +static NOINLINE void EMIT(u32 op, u32 dst, u32 src) +{ + void * emit_ptr = (u32 *)tcache_ptr - emit_index; + struct emit_op *const ptr = emit_cache; + const int n = emit_index+1; + int i, bi, bd = 0; + + // account for new insn in tcache + tcache_ptr = (void *)((u32 *)tcache_ptr + 1); + COUNT_OP; + // for conditional execution SR is always source + if (op < 0xe0000000 /*A_COND_AL << 28*/) + src |= M1(SR); + // put insn on back of queue // mask away the NO token + emit_cache[n] = (struct emit_op) + { .op=op, .src=src & ~M1(NO), .dst=dst & ~M1(NO) }; + // check insns down the queue as long as permitted by dependencies + for (bd = bi = 0, i = emit_index; i > 1 && !(dst & M1(PC)); i--) { + int deps = 0; + // dst deps between i and n must not be swapped, since any deps + // but [i].src & [n].src lead to changed semantics if swapped. + if ((ptr[i].dst & ptr[n].src) || (ptr[n].dst & ptr[i].src) || + (ptr[i].dst & ptr[n].dst)) + break; + // don't swap insns reading PC if it's not a word pool load + // (ptr[i].op&0xf700000) != EOP_C_AM2_IMM(0,0,0,1,0,0,0)) + if ((ptr[i].src & M1(PC)) && (ptr[i].op&0xf700000) != 0x5100000) + break; + + // calculate ARM920T interlock cycles (differences only) +#define D2(x,y) ((ptr[x].dst & ptr[y].src)?((ptr[x].src >> CYC2) & 1):0) +#define D1(x,y) ((ptr[x].dst & ptr[y].src)?((ptr[x].src >> CYC1) & 3):0) + // insn sequence: [..., i-2, i-1, i, i+1, ..., n-2, n-1, n] + deps -= D2(i-2,i)+D2(i-1,i+1)+D2(n-2,n ) + D1(i-1,i)+D1(n-1,n); + deps -= !!(ptr[n].src & M2(CYC1,CYC2));// favour moving LDR down + // insn sequence: [..., i-2, i-1, n, i, i+1, ..., n-2, n-1] + deps += D2(i-2,n)+D2(i-1,i )+D2(n ,i+1) + D1(i-1,n)+D1(n ,i); + deps += !!(ptr[i].src & M2(CYC1,CYC2));// penalize moving LDR up + // remember best match found + if (bd > deps) + bd = deps, bi = i; + } + // swap if fewer depencies + if (bd < 0) { + // make room for new insn at bi + struct emit_op tmp = ptr[n]; + for (i = n-1; i >= bi; i--) { + ptr[i+1] = ptr[i]; + if (ptr[i].src & M1(PC)) + emith_pool_adjust(n-i+1, 1); + } + // insert new insn at bi + ptr[bi] = tmp; + if (ptr[bi].src & M1(PC)) + emith_pool_adjust(1, bi-n); + } + if (dst & M1(PC)) { + // commit everything if a branch insn is emitted + for (i = 1; i <= emit_index+1; i++) + EMIT_PTR(emit_ptr, emit_cache[i].op); + emit_index = 0; + } else if (emit_index < EMIT_CACHE_SIZE) { + // queue not yet full + emit_index++; + } else { + // commit oldest insn from cache + EMIT_PTR(emit_ptr, emit_cache[1].op); + for (i = 0; i <= emit_index; i++) + emit_cache[i] = emit_cache[i+1]; + } +} + +static void emith_flush(void) +{ + int i; + void *emit_ptr = tcache_ptr - emit_index*sizeof(u32); -#define A_R4M (1 << 4) -#define A_R5M (1 << 5) -#define A_R6M (1 << 6) -#define A_R7M (1 << 7) -#define A_R8M (1 << 8) -#define A_R9M (1 << 9) -#define A_R10M (1 << 10) -#define A_R11M (1 << 11) -#define A_R12M (1 << 12) -#define A_R14M (1 << 14)
View file
libretro-picodrive-0~git20200716.tar.xz/cpu/drc/emit_arm64.c
Added
@@ -0,0 +1,1416 @@ +/* + * Basic macros to emit ARM A64 instructions and some utils + * Copyright (C) 2019 kub + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ +#define HOST_REGS 32 + +// AAPCS64: params: r0-r7, return: r0-r1, temp: r8-r17, saved: r19-r29 +// reserved: r18 (for platform use) +#define RET_REG 0 +#define PARAM_REGS { 0, 1, 2, 3, 4, 5, 6, 7 } +#define PRESERVED_REGS { 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 } +#define TEMPORARY_REGS { 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 } + +#define CONTEXT_REG 29 +#define STATIC_SH2_REGS { SHR_SR,28 , SHR_R(0),27 , SHR_R(1),26 } + +// R31 doesn't exist, it aliases either with zero or SP +#define SP 31 // stack pointer +#define Z0 31 // zero register +#define LR 30 // link register +#define FP 29 // frame pointer +#define PR 18 // platform register + +// All operations but ptr ops are using the lower 32 bits of the A64 registers. +// The upper 32 bits are only used in ptr ops and are zeroed by A64 32 bit ops. + + +#define A64_COND_EQ 0x0 +#define A64_COND_NE 0x1 +#define A64_COND_HS 0x2 +#define A64_COND_LO 0x3 +#define A64_COND_MI 0x4 +#define A64_COND_PL 0x5 +#define A64_COND_VS 0x6 +#define A64_COND_VC 0x7 +#define A64_COND_HI 0x8 +#define A64_COND_LS 0x9 +#define A64_COND_GE 0xa +#define A64_COND_LT 0xb +#define A64_COND_GT 0xc +#define A64_COND_LE 0xd +#define A64_COND_CS A64_COND_HS +#define A64_COND_CC A64_COND_LO +// "fake" conditions for T bit handling +#define A64_COND_AL 0xe +#define A64_COND_NV 0xf + +// DRC conditions +#define DCOND_EQ A64_COND_EQ +#define DCOND_NE A64_COND_NE +#define DCOND_MI A64_COND_MI +#define DCOND_PL A64_COND_PL +#define DCOND_HI A64_COND_HI +#define DCOND_HS A64_COND_HS +#define DCOND_LO A64_COND_LO +#define DCOND_GE A64_COND_GE +#define DCOND_GT A64_COND_GT +#define DCOND_LT A64_COND_LT +#define DCOND_LS A64_COND_LS +#define DCOND_LE A64_COND_LE +#define DCOND_VS A64_COND_VS +#define DCOND_VC A64_COND_VC + +#define DCOND_CS A64_COND_HS +#define DCOND_CC A64_COND_LO + + +// unified insn +#define A64_INSN(op, b29, b22, b21, b16, b12, b10, b5, b0) \ + (((op)<<25)|((b29)<<29)|((b22)<<22)|((b21)<<21)|((b16)<<16)|((b12)<<12)|((b10)<<10)|((b5)<<5)|((b0)<<0)) + +#define _ 0 // marker for "field unused" + +#define A64_NOP \ + A64_INSN(0xa,0x6,0x4,_,0x3,0x2,_,0,0x1f) // 0xd503201f + +// arithmetic/logical + +enum { OP_AND, OP_OR, OP_EOR, OP_ANDS, OP_ADD, OP_ADDS, OP_SUB, OP_SUBS }; +enum { ST_LSL, ST_LSR, ST_ASR, ST_ROR }; +enum { XT_UXTW=0x4, XT_UXTX=0x6, XT_LSL=0x7, XT_SXTW=0xc, XT_SXTX=0xe }; +#define OP_SZ64 (1 << 31) // bit for 64 bit op selection +#define OP_N64 (1 << 22) // N-bit for 64 bit logical immediate ops + +#define A64_OP_REG(op, n, rd, rn, rm, stype, simm) /* arith+logical, ST_ */ \ + A64_INSN(0x5,(op)&3,((op)&4)|stype,n,rm,_,simm,rn,rd) +#define A64_OP_XREG(op, rd, rn, rm, xtopt, simm) /* arith, XT_ */ \ + A64_INSN(0x5,(op)&3,0x4,1,rm,xtopt,simm,rn,rd) +#define A64_OP_IMM12(op, rd, rn, imm, lsl12) /* arith */ \ + A64_INSN(0x8,(op)&3,((op)&4)|lsl12,_,_,_,(imm)&0xfff,rn,rd) +#define A64_OP_IMMBM(op, rd, rn, immr, imms) /* logical */ \ + A64_INSN(0x9,(op)&3,0x0,_,immr,_,(imms)&0x3f,rn,rd) + +// rd = rn OP (rm SHIFT simm) +#define A64_ADD_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_ADD,0,rd,rn,rm,stype,simm) +#define A64_ADDS_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_ADDS,0,rd,rn,rm,stype,simm) +#define A64_SUB_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_SUB,0,rd,rn,rm,stype,simm) +#define A64_SUBS_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_SUBS,0,rd,rn,rm,stype,simm) + +#define A64_NEG_REG(rd, rm, stype, simm) \ + A64_SUB_REG(rd,Z0,rm,stype,simm) +#define A64_NEGS_REG(rd, rm, stype, simm) \ + A64_SUBS_REG(rd,Z0,rm,stype,simm) +#define A64_NEGC_REG(rd, rm) \ + A64_SBC_REG(rd,Z0,rm) +#define A64_NEGCS_REG(rd, rm) \ + A64_SBCS_REG(rd,Z0,rm) +#define A64_CMP_REG(rn, rm, stype, simm) \ + A64_SUBS_REG(Z0, rn, rm, stype, simm) +#define A64_CMN_REG(rn, rm, stype, simm) \ + A64_ADDS_REG(Z0, rn, rm, stype, simm) + +#define A64_EOR_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_EOR,0,rd,rn,rm,stype,simm) +#define A64_OR_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_OR,0,rd,rn,rm,stype,simm) +#define A64_ORN_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_OR,1,rd,rn,rm,stype,simm) +#define A64_AND_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_AND,0,rd,rn,rm,stype,simm) +#define A64_ANDS_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_ANDS,0,rd,rn,rm,stype,simm) +#define A64_BIC_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_AND,1,rd,rn,rm,stype,simm) +#define A64_BICS_REG(rd, rn, rm, stype, simm) \ + A64_OP_REG(OP_ANDS,1,rd,rn,rm,stype,simm) + +#define A64_TST_REG(rn, rm, stype, simm) \ + A64_ANDS_REG(Z0, rn, rm, stype, simm) +#define A64_MOV_REG(rd, rm, stype, simm) \ + A64_OR_REG(rd, Z0, rm, stype, simm) +#define A64_MVN_REG(rd, rm, stype, simm) \ + A64_ORN_REG(rd, Z0, rm, stype, simm) + +// rd = rn OP (rm EXTEND simm) +#define A64_ADD_XREG(rd, rn, rm, xtopt, simm) \ + A64_OP_XREG(OP_ADD,rd,rn,rm,xtopt,simm) +#define A64_ADDS_XREG(rd, rn, rm, xtopt, simm) \ + A64_OP_XREG(OP_ADDS,rd,rn,rm,xtopt,simm) +#define A64_SUB_XREG(rd, rn, rm, stype, simm) \ + A64_OP_XREG(OP_SUB,rd,rn,rm,xtopt,simm) +#define A64_SUBS_XREG(rd, rn, rm, stype, simm) \ + A64_OP_XREG(OP_SUBS,rd,rn,rm,xtopt,simm) + +// rd = rn OP rm OP carry +#define A64_ADC_REG(rd, rn, rm) \ + A64_INSN(0xd,OP_ADD &3,0x0,_,rm,_,_,rn,rd) +#define A64_ADCS_REG(rd, rn, rm) \ + A64_INSN(0xd,OP_ADDS&3,0x0,_,rm,_,_,rn,rd) +#define A64_SBC_REG(rd, rn, rm) \ + A64_INSN(0xd,OP_SUB &3,0x0,_,rm,_,_,rn,rd) +#define A64_SBCS_REG(rd, rn, rm) \ + A64_INSN(0xd,OP_SUBS&3,0x0,_,rm,_,_,rn,rd) + +// rd = rn SHIFT rm +#define A64_LSL_REG(rd, rn, rm) \ + A64_INSN(0xd,0x0,0x3,_,rm,_,0x8,rn,rd) +#define A64_LSR_REG(rd, rn, rm) \ + A64_INSN(0xd,0x0,0x3,_,rm,_,0xa,rn,rd) +#define A64_ASR_REG(rd, rn, rm) \ + A64_INSN(0xd,0x0,0x3,_,rm,_,0x9,rn,rd) +#define A64_ROR_REG(rd, rn, rm) \ + A64_INSN(0xd,0x0,0x3,_,rm,_,0xb,rn,rd) + +// rd = REVERSE(rn) +#define A64_RBIT_REG(rd, rn) \ + A64_INSN(0xd,0x2,0x3,_,_,_,_,rn,rd) + +// rd = rn OP (imm12 << (0|12)) +#define A64_ADD_IMM(rd, rn, imm12, lsl12) \ + A64_OP_IMM12(OP_ADD, rd, rn, imm12, lsl12) +#define A64_ADDS_IMM(rd, rn, imm12, lsl12) \ + A64_OP_IMM12(OP_ADDS, rd, rn, imm12, lsl12) +#define A64_SUB_IMM(rd, rn, imm12, lsl12) \ + A64_OP_IMM12(OP_SUB, rd, rn, imm12, lsl12) +#define A64_SUBS_IMM(rd, rn, imm12, lsl12) \ + A64_OP_IMM12(OP_SUBS, rd, rn, imm12, lsl12) + +#define A64_CMP_IMM(rn, imm12, lsl12) \ + A64_SUBS_IMM(Z0,rn,imm12,lsl12) +#define A64_CMN_IMM(rn, imm12, lsl12) \ + A64_ADDS_IMM(Z0,rn,imm12,lsl12) + +// rd = rn OP immbm; immbm is a repeated special pattern of 2^n bits length +#define A64_EOR_IMM(rd, rn, immr, imms) \ + A64_OP_IMMBM(OP_EOR,rd,rn,immr,imms) +#define A64_OR_IMM(rd, rn, immr, imms) \ + A64_OP_IMMBM(OP_OR,rd,rn,immr,imms) +#define A64_AND_IMM(rd, rn, immr, imms) \ + A64_OP_IMMBM(OP_AND,rd,rn,immr,imms) +#define A64_ANDS_IMM(rd, rn, immr, imms) \ + A64_OP_IMMBM(OP_ANDS,rd,rn,immr,imms)
View file
libretro-picodrive-0~git20200716.tar.xz/cpu/drc/emit_mips.c
Added
@@ -0,0 +1,1842 @@ +/* + * Basic macros to emit MIPS32/MIPS64 Release 1 or 2 instructions and some utils + * Copyright (C) 2019 kub + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ +#define HOST_REGS 32 + +// MIPS32 ABI: params: r4-r7, return: r2-r3, temp: r1(at),r8-r15,r24-r25,r31(ra) +// saved: r16-r23,r30, reserved: r0(zero), r26-r27(irq), r28(gp), r29(sp) +// r1,r15,r24,r25(at,t7-t9) are used internally by the code emitter +// MIPSN32/MIPS64 ABI: params: r4-r11, no caller-reserved save area on stack +#define RET_REG 2 // v0 +#define PARAM_REGS { 4, 5, 6, 7 } // a0-a3 +#define PRESERVED_REGS { 16, 17, 18, 19, 20, 21, 22, 23 } // s0-s7 +#define TEMPORARY_REGS { 2, 3, 8, 9, 10, 11, 12, 13, 14 } // v0-v1,t0-t6 + +#define CONTEXT_REG 23 // s7 +#define STATIC_SH2_REGS { SHR_SR,22 , SHR_R(0),21 , SHR_R(1),20 } + +// NB: the ubiquitous JZ74[46]0 uses MIPS32 Release 1, a slight MIPS II superset +#ifndef __mips_isa_rev +#define __mips_isa_rev 1 // surprisingly not always defined +#endif + +// registers usable for user code: r1-r25, others reserved or special +#define Z0 0 // zero register +#define GP 28 // global pointer +#define SP 29 // stack pointer +#define FP 30 // frame pointer +#define LR 31 // link register +// internally used by code emitter: +#define AT 1 // used to hold intermediate results +#define FNZ 15 // emulated processor flags: N (bit 31) ,Z (all bits) +#define FC 24 // emulated processor flags: C (bit 0), others 0 +#define FV 25 // emulated processor flags: Nt^Ns (bit 31). others x + +// All operations but ptr ops are using the lower 32 bits of the registers. +// The upper 32 bits always contain the sign extension from the lower 32 bits. + +// unified conditions; virtual, not corresponding to anything real on MIPS +#define DCOND_EQ 0x0 +#define DCOND_NE 0x1 +#define DCOND_HS 0x2 +#define DCOND_LO 0x3 +#define DCOND_MI 0x4 +#define DCOND_PL 0x5 +#define DCOND_VS 0x6 +#define DCOND_VC 0x7 +#define DCOND_HI 0x8 +#define DCOND_LS 0x9 +#define DCOND_GE 0xa +#define DCOND_LT 0xb +#define DCOND_GT 0xc +#define DCOND_LE 0xd + +#define DCOND_CS DCOND_LO +#define DCOND_CC DCOND_HS + +// unified insn +#define MIPS_INSN(op, rs, rt, rd, sa, fn) \ + (((op)<<26)|((rs)<<21)|((rt)<<16)|((rd)<<11)|((sa)<<6)|((fn)<<0)) + +#define _ 0 // marker for "field unused" +#define __(n) o##n // enum marker for "undefined" + +// opcode field (encoded in op) +enum { OP__FN=000, OP__RT, OP_J, OP_JAL, OP_BEQ, OP_BNE, OP_BLEZ, OP_BGTZ }; +enum { OP_ADDI=010, OP_ADDIU, OP_SLTI, OP_SLTIU, OP_ANDI, OP_ORI, OP_XORI, OP_LUI }; +enum { OP_DADDI=030, OP_DADDIU, OP_LDL, OP_LDR, OP__FN2=034, OP__FN3=037 }; +enum { OP_LB=040, OP_LH, OP_LWL, OP_LW, OP_LBU, OP_LHU, OP_LWR, OP_LWU }; +enum { OP_SB=050, OP_SH, OP_SWL, OP_SW, OP_SDL, OP_SDR, OP_SWR }; +enum { OP_SD=067, OP_LD=077 }; +// function field (encoded in fn if opcode = OP__FN) +enum { FN_SLL=000, __(01), FN_SRL, FN_SRA, FN_SLLV, __(05), FN_SRLV, FN_SRAV }; +enum { FN_JR=010, FN_JALR, FN_MOVZ, FN_MOVN, FN_SYNC=017 }; +enum { FN_MFHI=020, FN_MTHI, FN_MFLO, FN_MTLO, FN_DSSLV, __(25), FN_DSLRV, FN_DSRAV }; +enum { FN_MULT=030, FN_MULTU, FN_DIV, FN_DIVU, FN_DMULT, FN_DMULTU, FN_DDIV, FN_DDIVU }; +enum { FN_ADD=040, FN_ADDU, FN_SUB, FN_SUBU, FN_AND, FN_OR, FN_XOR, FN_NOR }; +enum { FN_SLT=052, FN_SLTU, FN_DADD, FN_DADDU, FN_DSUB, FN_DSUBU }; +enum { FN_DSLL=070, __(71), FN_DSRL, FN_DSRA, FN_DSLL32, __(75), FN_DSRL32, FN_DSRA32 }; +// function field (encoded in fn if opcode = OP__FN2) +enum { FN2_MADD=000, FN2_MADDU, FN2_MUL, __(03), FN2_MSUB, FN2_MSUBU }; +enum { FN2_CLZ=040, FN2_CLO, FN2_DCLZ=044, FN2_DCLO }; +// function field (encoded in fn if opcode = OP__FN3) +enum { FN3_EXT=000, FN3_DEXTM, FN3_DEXTU, FN3_DEXT, FN3_INS, FN3_DINSM, FN3_DINSU, FN3_DINS }; +enum { FN3_BSHFL=040, FN3_DBSHFL=044 }; +// rt field (encoded in rt if opcode = OP__RT) +enum { RT_BLTZ=000, RT_BGEZ, RT_BLTZAL=020, RT_BGEZAL, RT_SYNCI=037 }; + +// bit shuffle function (encoded in sa if function = FN3_BSHFL) +enum { BS_SBH=002, BS_SHD=005, BS_SEB=020, BS_SEH=030 }; +// r (rotate) bit function (encoded in rs/sa if function = FN_SRL/FN_SRLV) +enum { RB_SRL=0, RB_ROTR=1 }; + +#define MIPS_NOP 000 // null operation: SLL r0, r0, #0 + +// arithmetic/logical + +#define MIPS_OP_REG(op, sa, rd, rs, rt) \ + MIPS_INSN(OP__FN, rs, rt, rd, sa, op) // R-type, SPECIAL +#define MIPS_OP2_REG(op, sa, rd, rs, rt) \ + MIPS_INSN(OP__FN2, rs, rt, rd, sa, op) // R-type, SPECIAL2 +#define MIPS_OP3_REG(op, sa, rd, rs, rt) \ + MIPS_INSN(OP__FN3, rs, rt, rd, sa, op) // R-type, SPECIAL3 +#define MIPS_OP_IMM(op, rt, rs, imm) \ + MIPS_INSN(op, rs, rt, _, _, (u16)(imm)) // I-type + +// rd = rs OP rt +#define MIPS_ADD_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_ADDU,_, rd, rs, rt) +#define MIPS_DADD_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_DADDU,_, rd, rs, rt) +#define MIPS_SUB_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_SUBU,_, rd, rs, rt) +#define MIPS_DSUB_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_DSUBU,_, rd, rs, rt) + +#define MIPS_NEG_REG(rd, rt) \ + MIPS_SUB_REG(rd, Z0, rt) + +#define MIPS_XOR_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_XOR,_, rd, rs, rt) +#define MIPS_OR_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_OR,_, rd, rs, rt) +#define MIPS_AND_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_AND,_, rd, rs, rt) +#define MIPS_NOR_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_NOR,_, rd, rs, rt) + +#define MIPS_MOVE_REG(rd, rs) \ + MIPS_OR_REG(rd, rs, Z0) +#define MIPS_MVN_REG(rd, rs) \ + MIPS_NOR_REG(rd, rs, Z0) + +// rd = rt SHIFT rs +#define MIPS_LSL_REG(rd, rt, rs) \ + MIPS_OP_REG(FN_SLLV,_, rd, rs, rt) +#define MIPS_LSR_REG(rd, rt, rs) \ + MIPS_OP_REG(FN_SRLV,RB_SRL, rd, rs, rt) +#define MIPS_ASR_REG(rd, rt, rs) \ + MIPS_OP_REG(FN_SRAV,_, rd, rs, rt) +#define MIPS_ROR_REG(rd, rt, rs) \ + MIPS_OP_REG(FN_SRLV,RB_ROTR, rd, rs, rt) + +#define MIPS_SEB_REG(rd, rt) \ + MIPS_OP3_REG(FN3_BSHFL, BS_SEB, rd, _, rt) +#define MIPS_SEH_REG(rd, rt) \ + MIPS_OP3_REG(FN3_BSHFL, BS_SEH, rd, _, rt) + +#define MIPS_EXT_IMM(rt, rs, lsb, sz) \ + MIPS_OP3_REG(FN3_EXT, lsb, (sz)-1, rs, rt) +#define MIPS_INS_IMM(rt, rs, lsb, sz) \ + MIPS_OP3_REG(FN3_INS, lsb, (lsb)+(sz)-1, rs, rt) + +// rd = (rs < rt) +#define MIPS_SLT_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_SLT,_, rd, rs, rt) +#define MIPS_SLTU_REG(rd, rs, rt) \ + MIPS_OP_REG(FN_SLTU,_, rd, rs, rt) + +// rt = rs OP imm16 +#define MIPS_ADD_IMM(rt, rs, imm16) \ + MIPS_OP_IMM(OP_ADDIU, rt, rs, imm16) +#define MIPS_DADD_IMM(rt, rs, imm16) \ + MIPS_OP_IMM(OP_DADDIU, rt, rs, imm16) + +#define MIPS_XOR_IMM(rt, rs, imm16) \ + MIPS_OP_IMM(OP_XORI, rt, rs, imm16) +#define MIPS_OR_IMM(rt, rs, imm16) \ + MIPS_OP_IMM(OP_ORI, rt, rs, imm16) +#define MIPS_AND_IMM(rt, rs, imm16) \ + MIPS_OP_IMM(OP_ANDI, rt, rs, imm16) + +// rt = (imm16 << (0|16)) +#define MIPS_MOV_IMM(rt, imm16) \ + MIPS_OP_IMM(OP_ORI, rt, Z0, imm16) +#define MIPS_MOVT_IMM(rt, imm16) \ + MIPS_OP_IMM(OP_LUI, rt, _, imm16) + +// rd = rt SHIFT imm5 +#define MIPS_LSL_IMM(rd, rt, bits) \ + MIPS_INSN(OP__FN, _, rt, rd, bits, FN_SLL) +#define MIPS_LSR_IMM(rd, rt, bits) \ + MIPS_INSN(OP__FN, RB_SRL, rt, rd, bits, FN_SRL) +#define MIPS_ASR_IMM(rd, rt, bits) \ + MIPS_INSN(OP__FN, _, rt, rd, bits, FN_SRA) +#define MIPS_ROR_IMM(rd, rt, bits) \ + MIPS_INSN(OP__FN, RB_ROTR, rt, rd, bits, FN_SRL) + +#define MIPS_DLSL_IMM(rd, rt, bits) \ + MIPS_INSN(OP__FN, _, rt, rd, bits, FN_DSLL) +#define MIPS_DLSL32_IMM(rd, rt, bits) \ + MIPS_INSN(OP__FN, _, rt, rd, bits, FN_DSLL32) + +// rt = (rs < imm16) +#define MIPS_SLT_IMM(rt, rs, imm16) \ + MIPS_OP_IMM(OP_SLTI, rt, rs, imm16)
View file
libretro-picodrive-0~git20200716.tar.xz/cpu/drc/emit_ppc.c
Added
@@ -0,0 +1,1773 @@ +/* + * Basic macros to emit PowerISA 2.03 64 bit instructions and some utils + * Copyright (C) 2020 kub + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ + +// NB bit numbers are reversed in PPC (MSB is bit 0). The emith_* functions and +// macros must take this into account. + +// NB PPC was a 64 bit architecture from the onset, so basically all operations +// are operating on 64 bits. 32 bit arch was only added later on, and there are +// very few 32 bit operations (cmp*, shift/rotate, extract/insert, load/store). +// For most operations the upper bits don't spill into the lower word, for the +// others there is an appropriate 32 bit operation available. + +// NB PowerPC isn't a clean RISC design. Several insns use microcode, which is +// AFAIK notably slower than using some 2-3 non-microcode insns. So, using +// such insns should by avoided if possible. Listed in Cell handbook, App. A: +// - shift/rotate having the amount in a register +// - arithmetic/logical having the RC flag set (except cmp*) +// - load/store algebraic (l?a*), multiple (lmw/stmw), string (ls*/sts*) +// - mtcrf (and some more SPR related, not used here) +// moreover, misaligned load/store crossing a cacheline boundary are microcoded. +// Note also that load/store string isn't available in little endian mode. + +// NB flag handling in PPC differs grossly from the ARM/X86 model. There are 8 +// fields in the condition register, each having 4 condition bits. However, only +// the EQ bit is similar to the Z flag. The CA and OV bits in the XER register +// are similar to the C and V bits, but shifts don't use CA, and cmp* doesn't +// use CA and OV. +// Moreover, there's no easy possibility to get CA and OV for 32 bit arithmetic +// since all arithmetic/logical insns use 64 bit. +// For now, use the "no flags" code from the RISC-V backend. + +#define HOST_REGS 32 + +// PPC64: params: r3-r10, return: r3, temp: r0,r11-r12, saved: r14-r31 +// reserved: r0(zero), r1(stack), r2(TOC), r13(TID) +#define RET_REG 3 +#define PARAM_REGS { 3, 4, 5, 6, 7, 8, 9, 10 } +#define PRESERVED_REGS { 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 } +#define TEMPORARY_REGS { 11, 12 } + +#define CONTEXT_REG 31 +#define STATIC_SH2_REGS { SHR_SR,30 , SHR_R(0),29 , SHR_R(1),28 } + +// if RA is 0 in non-update memory insns, ADDI/ADDIS, ISEL, it aliases with zero +#define Z0 0 // zero register +#define SP 1 // stack pointer +// SPR registers +#define XER -1 // exception register +#define LR -8 // link register +#define CTR -9 // counter register +// internally used by code emitter: +#define AT 0 // emitter temporary (can't be fully used anyway) +#define FNZ 14 // emulated processor flags: N (bit 31) ,Z (all bits) +#define FC 15 // emulated processor flags: C (bit 0), others 0 +#define FV 16 // emulated processor flags: Nt^Ns (bit 31). others x + + +// unified conditions; virtual, not corresponding to anything real on PPC +#define DCOND_EQ 0x0 +#define DCOND_NE 0x1 +#define DCOND_HS 0x2 +#define DCOND_LO 0x3 +#define DCOND_MI 0x4 +#define DCOND_PL 0x5 +#define DCOND_VS 0x6 +#define DCOND_VC 0x7 +#define DCOND_HI 0x8 +#define DCOND_LS 0x9 +#define DCOND_GE 0xa +#define DCOND_LT 0xb +#define DCOND_GT 0xc +#define DCOND_LE 0xd + +#define DCOND_CS DCOND_LO +#define DCOND_CC DCOND_HS + +// unified insn; use right-aligned bit offsets for the bitfields +#define PPC_INSN(op, b10, b15, b20, b31) \ + (((op)<<26)|((b10)<<21)|((b15)<<16)|((b20)<<11)|((b31)<<0)) + +#define _ 0 // marker for "field unused" +#define __(n) o##n // enum marker for "undefined" +#define _CB(v,l,s,d) ((((v)>>(s))&((1<<(l))-1))<<(d)) // copy l bits + +// NB everything privileged or unneeded at 1st sight is left out +// opcode field (encoded in OPCD, bits 0-5) +enum { OP__LMA=004, OP_MULLI=007, + OP_SUBFIC, __(11), OP_CMPLI, OP_CMPI, OP_ADDIC, OP_ADDICF, OP_ADDI, OP_ADDIS, + OP_BC, __(21), OP_B, OP__CR, OP_RLWIMI, OP_RLWINM, __(26), OP_RLWNM, + OP_ORI, OP_ORIS, OP_XORI, OP_XORIS, OP_ANDI, OP_ANDIS, OP__RLD, OP__EXT, + OP_LWZ, OP_LWZU, OP_LBZ, OP_LBZU, OP_STW, OP_STWU, OP_STB, OP_STBU, + OP_LHZ, OP_LHZU, OP_LHA, OP_LHAU, OP_STH, OP_STHU, OP_LMW, OP_STMW, + /*OP_LQ=070,*/ OP__LD=072, OP__ST=076 }; +// CR subops (encoded in bits 21-31) +enum { OPC_MCRF=0, OPC_BCLR=32, OPC_BCCTR=1056 }; +// RLD subops (encoded in XO bits 27-31) +enum { OPR_RLDICL=0, OPR_RLDICR=4, OPR_RLDIC=8, OPR_RLDIMI=12, OPR_RLDCL=16, OPR_RLDCR=18 }; +// EXT subops (encoded in XO bits 21-31) +enum { + // arith/logical + OPE_CMP=0, OPE_SUBFC=16, OPE_ADDC=20, OPE_AND=56, + OPE_CMPL=64, OPE_SUBF=80, OPE_ANDC=120, OPE_NEG=208, OPE_NOR=248, + OPE_SUBFE=272, OPE_ADDE=276, OPE_SUBFZE=400, OPE_ADDZE=404, OPE_SUBFME=464, OPE_ADDME=468, + OPE_ADD=532, OPE_EQV=568, OPE_XOR=632, OPE_ORC=824, OPE_OR=888, OPE_NAND=952, + // shift + OPE_SLW=48, OPE_SLD=54, OPE_SRW=1072, OPE_SRD=1078, OPE_SRAW=1584, OPE_SRAD=1588, OPE_SRAWI=1648, OPE_SRADI=1652, + // extend, bitcount + OPE_CNTLZW=52, OPE_CNTLZD=116, OPE_EXTSH=1844, OPE_EXTSB=1908, OPE_EXTSW=1972, + // mult/div + OPE_MULHDU=18, OPE_MULHWU=22, OPE_MULHD=146, OPE_MULHW=150, OPE_MULLD=466, OPE_MULLW=470, + OPE_DIVDU=914, OPE_DIVWU=918, OPE_DIVD=978, OPE_DIVW=982, + // load/store indexed + OPE_LDX=42, OPE_LDUX=106, OPE_STDX=298, OPE_STDUX=362, + OPE_LWZX=46, OPE_LWZUX=110, OPE_LWAX=682, OPE_LWAUX=746, OPE_STWX=302, OPE_STWUX=366, + OPE_LBZX=174, OPE_LBZUX=238, /* no LBAX/LBAUX... */ OPE_STBX=430, OPE_STBUX=494, + OPE_LHZX=558, OPE_LHZUX=622, OPE_LHAX=686, OPE_LHAUX=750, OPE_STHX=814, OPE_STHUX=878, + // SPR, CR related + OPE_ISEL=15, OPE_MFCR=38, OPE_MTCRF=288, OPE_MFSPR=678, OPE_MTSPR=934, OPE_MCRXR=1024, +}; +// LD subops (encoded in XO bits 30-31) +enum { OPL_LD, OPL_LDU, OPL_LWA }; +// ST subops (encoded in XO bits 30-31) +enum { OPS_STD, OPS_STDU /*,OPS_STQ*/ }; + +// X*,M*-forms insns often have overflow detect in b21 and CR0 update in b31 +#define XOE (1<<10) // (31-21) +#define XRC (1<<0) // (31-31) +#define XF (XOE|XRC) +// MB and ME in M*-forms rotate left +#define MM(b,e) (((b)<<6)|((e)<<1)) +#define MD(b,s) (_CB(b,5,0,6)|_CB(b,1,5,5)|_CB(s,5,0,11)|_CB(s,1,5,1)) +// AA and LK in I,B-forms branches +#define BAA (1<<1) +#define BLK (1<<0) +// BO and BI condition codes in B-form, BO0-BO4:BI2-BI4 since we only need CR0 +#define BLT 0x60 +#define BGE 0x20 +#define BGT 0x61 +#define BLE 0x21 +#define BEQ 0x62 +#define BNE 0x22 +#define BXX 0xa0 // unconditional, aka always + +#define PPC_NOP \ + PPC_INSN(OP_ORI, 0, 0, _, 0) // ori r0, r0, 0 + +// arithmetic/logical + +#define PPC_OP_REG(op, xop, rt, ra, rb) /* X*,M*-form */ \ + PPC_INSN((unsigned)op, rt, ra, rb, xop) +#define PPC_OP_IMM(op, rt, ra, imm) /* D,B,I-form */ \ + PPC_INSN((unsigned)op, rt, ra, _, imm) + +// rt = ra OP rb +#define PPC_ADD_REG(rt, ra, rb) \ + PPC_OP_REG(OP__EXT,OPE_ADD,rt,ra,rb) +#define PPC_ADDC_REG(rt, ra, rb) \ + PPC_OP_REG(OP__EXT,OPE_ADD|XOE,rt,ra,rb) +#define PPC_SUB_REG(rt, rb, ra) /* NB reversed args (rb-ra) */ \ + PPC_OP_REG(OP__EXT,OPE_SUBF,rt,ra,rb) +#define PPC_SUBC_REG(rt, rb, ra) \ + PPC_OP_REG(OP__EXT,OPE_SUBF|XOE,rt,ra,rb) +#define PPC_NEG_REG(rt, ra) \ + PPC_OP_REG(OP__EXT,OPE_NEG,rt,ra,_) +#define PPC_NEGC_REG(rt, ra) \ + PPC_OP_REG(OP__EXT,OPE_NEG|XOE,rt,ra,_) + +#define PPC_CMP_REG(ra, rb) \ + PPC_OP_REG(OP__EXT,OPE_CMP,1,ra,rb) +#define PPC_CMPL_REG(ra, rb) \ + PPC_OP_REG(OP__EXT,OPE_CMPL,1,ra,rb) + +#define PPC_CMPW_REG(ra, rb) \ + PPC_OP_REG(OP__EXT,OPE_CMP,0,ra,rb) +#define PPC_CMPLW_REG(ra, rb) \ + PPC_OP_REG(OP__EXT,OPE_CMPL,0,ra,rb) + +#define PPC_XOR_REG(ra, rt, rb) \ + PPC_OP_REG(OP__EXT,OPE_XOR,rt,ra,rb) +#define PPC_OR_REG(ra, rt, rb) \ + PPC_OP_REG(OP__EXT,OPE_OR,rt,ra,rb) +#define PPC_ORN_REG(ra, rt, rb) \ + PPC_OP_REG(OP__EXT,OPE_ORC,rt,ra,rb) +#define PPC_NOR_REG(ra, rt, rb) \ + PPC_OP_REG(OP__EXT,OPE_NOR,rt,ra,rb) +#define PPC_AND_REG(ra, rt, rb) \ + PPC_OP_REG(OP__EXT,OPE_AND,rt,ra,rb) +#define PPC_BIC_REG(ra, rt, rb) \ + PPC_OP_REG(OP__EXT,OPE_ANDC,rt,ra,rb) + +#define PPC_MOV_REG(rt, ra) \ + PPC_OR_REG(rt, ra, ra) +#define PPC_MVN_REG(rt, ra) \ + PPC_NOR_REG(rt, ra, ra)
View file
libretro-picodrive-0~git20200716.tar.xz/cpu/drc/emit_riscv.c
Added
@@ -0,0 +1,1659 @@ +/* + * Basic macros to emit RISC-V RV64IM instructions and some utils + * Copyright (C) 2019 kub + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ +#define HOST_REGS 32 + +// RISC-V ABI: params: x10-x17, return: x10-x11, temp: x1(ra),x5-x7,x28-x31 +// saved: x8(fp),x9,x18-x27, reserved: x0(zero), x4(tp), x3(gp), x2(sp) +// x28-x31(t3-t6) are used internally by the code emitter +#define RET_REG 10 // a0 +#define PARAM_REGS { 10, 11, 12, 13, 14, 15, 16, 17 } // a0-a7 +#define PRESERVED_REGS { 9, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27 } // s1-s11 +#define TEMPORARY_REGS { 5, 6, 7 } // t0-t2 + +#define CONTEXT_REG 9 // s1 +#define STATIC_SH2_REGS { SHR_SR,27 , SHR_R(0),26 , SHR_R(1),25 } + +// registers usable for user code: r1-r25, others reserved or special +#define Z0 0 // zero register +#define GP 3 // global pointer +#define SP 2 // stack pointer +#define FP 8 // frame pointer +#define LR 1 // link register +// internally used by code emitter: +#define AT 31 // used to hold intermediate results +#define FNZ 30 // emulated processor flags: N (bit 31) ,Z (all bits) +#define FC 29 // emulated processor flags: C (bit 0), others 0 +#define FV 28 // emulated processor flags: Nt^Ns (bit 31). others x + +// All operations but ptr ops are using the lower 32 bits of the registers. +// The upper 32 bits always contain the sign extension from the lower 32 bits. + +// unified conditions; virtual, not corresponding to anything real on RISC-V +#define DCOND_EQ 0x0 +#define DCOND_NE 0x1 +#define DCOND_HS 0x2 +#define DCOND_LO 0x3 +#define DCOND_MI 0x4 +#define DCOND_PL 0x5 +#define DCOND_VS 0x6 +#define DCOND_VC 0x7 +#define DCOND_HI 0x8 +#define DCOND_LS 0x9 +#define DCOND_GE 0xa +#define DCOND_LT 0xb +#define DCOND_GT 0xc +#define DCOND_LE 0xd + +#define DCOND_CS DCOND_LO +#define DCOND_CC DCOND_HS + +// unified insn +#define R5_INSN(b25, b20, b15, b12, b7, op) \ + (((b25)<<25)|((b20)<<20)|((b15)<<15)|((b12)<<12)|((b7)<<7)|((op)<<0)) + +#define _ 0 //marker for "field unused" +#define _CB(v,l,s,d) ((((v)>>(s))&((1<<(l))-1))<<(d)) // copy l bits + +#define R5_R_INSN(op, f1, f2, rd, rs, rt) \ + R5_INSN(f2, rt, rs, f1, rd, op) +#define R5_I_INSN(op, f1, rd, rs, imm) \ + R5_INSN(_, _CB(imm,12,0,0), rs, f1, rd, op) +#define R5_S_INSN(op, f1, rt, rs, imm) \ + R5_INSN(_CB(imm,7,5,0), rt, rs, f1, _CB(imm,5,0,0), op) +#define R5_U_INSN(op, rd, imm) \ + R5_INSN(_,_,_, _CB(imm,20,12,0), rd, op) +// oy vey... R5 immediate encoding in branches is really unwieldy :-/ +#define R5_B_INSN(op, f1, rt, rs, imm) \ + R5_INSN(_CB(imm,1,12,6)|_CB(imm,6,5,0), rt, rs, f1, \ + _CB(imm,4,1,1)|_CB(imm,1,11,0), op) +#define R5_J_INSN(op, rd, imm) \ + R5_INSN(_CB(imm,1,20,6)|_CB(imm,6,5,0), _CB(imm,4,1,1)|_CB(imm,1,11,0),\ + _CB(imm,8,12,0), rd, op) + +// opcode +enum { OP_LUI=0x37, OP_AUIPC=0x17, OP_JAL=0x6f, // 20-bit immediate + OP_JALR=0x67, OP_BCOND=0x63, OP_LD=0x03, OP_ST=0x23, // 12-bit immediate + OP_IMM=0x13, OP_REG=0x33, OP_IMM32=0x1b, OP_REG32=0x3b }; +// func3 +enum { F1_ADD, F1_SL, F1_SLT, F1_SLTU, F1_XOR, F1_SR, F1_OR, F1_AND };// IMM/REG +enum { F1_MUL, F1_MULH, F1_MULHSU, F1_MULHU, F1_DIV, F1_DIVU, F1_REM, F1_REMU }; +enum { F1_BEQ, F1_BNE, F1_BLT=4, F1_BGE, F1_BLTU, F1_BGEU }; // BCOND +enum { F1_B, F1_H, F1_W, F1_D, F1_BU, F1_HU, F1_WU }; // LD/ST +// func7 +enum { F2_ALT=0x20, F2_MULDIV=0x01 }; + +#define R5_NOP R5_I_INSN(OP_IMM, F1_ADD, Z0, Z0, 0) // nop: ADDI r0, r0, #0 + +// arithmetic/logical + +// rd = rs OP rt +#define R5_ADD_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_ADD, _, rd, rs, rt) +#define R5_SUB_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_ADD, F2_ALT, rd, rs, rt) + +#define R5_NEG_REG(rd, rt) \ + R5_SUB_REG(rd, Z0, rt) + +#define R5_XOR_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_XOR, _, rd, rs, rt) +#define R5_OR_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_OR , _, rd, rs, rt) +#define R5_AND_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_AND, _, rd, rs, rt) + +// rd = rs SHIFT rt +#define R5_LSL_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_SL , _, rd, rs, rt) +#define R5_LSR_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_SR , _, rd, rs, rt) +#define R5_ASR_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_SR , F2_ALT, rd, rs, rt) + +// rd = (rs < rt) +#define R5_SLT_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_SLT, _, rd, rs, rt) +#define R5_SLTU_REG(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_SLTU,_, rd, rs, rt) + +// rd = rs OP imm12 +#define R5_ADD_IMM(rd, rs, imm12) \ + R5_I_INSN(OP_IMM, F1_ADD , rd, rs, imm12) + +#define R5_XOR_IMM(rd, rs, imm12) \ + R5_I_INSN(OP_IMM, F1_XOR , rd, rs, imm12) +#define R5_OR_IMM(rd, rs, imm12) \ + R5_I_INSN(OP_IMM, F1_OR , rd, rs, imm12) +#define R5_AND_IMM(rd, rs, imm12) \ + R5_I_INSN(OP_IMM, F1_AND , rd, rs, imm12) + +#define R5_MOV_REG(rd, rs) \ + R5_ADD_IMM(rd, rs, 0) +#define R5_MVN_REG(rd, rs) \ + R5_XOR_IMM(rd, rs, -1) + +// rd = (imm12 << (0|12)) +#define R5_MOV_IMM(rd, imm12) \ + R5_OR_IMM(rd, Z0, imm12) +#define R5_MOVT_IMM(rd, imm20) \ + R5_U_INSN(OP_LUI, rd, imm20) +#define R5_MOVA_IMM(rd, imm20) \ + R5_U_INSN(OP_AUIPC, rd, imm20) + +// rd = rs SHIFT imm5/imm6 +#define R5_LSL_IMM(rd, rs, bits) \ + R5_R_INSN(OP_IMM, F1_SL , _, rd, rs, bits) +#define R5_LSR_IMM(rd, rs, bits) \ + R5_R_INSN(OP_IMM, F1_SR , _, rd, rs, bits) +#define R5_ASR_IMM(rd, rs, bits) \ + R5_R_INSN(OP_IMM, F1_SR , F2_ALT, rd, rs, bits) + +// rd = (rs < imm12) +#define R5_SLT_IMM(rd, rs, imm12) \ + R5_I_INSN(OP_IMM, F1_SLT , rd, rs, imm12) +#define R5_SLTU_IMM(rd, rs, imm12) \ + R5_I_INSN(OP_IMM, F1_SLTU, rd, rs, imm12) + +// multiplication + +#define R5_MULHU(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_MULHU, F2_MULDIV, rd, rs, rt) +#define R5_MULHS(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_MULH, F2_MULDIV, rd, rs, rt) +#define R5_MUL(rd, rs, rt) \ + R5_R_INSN(OP_REG, F1_MUL, F2_MULDIV, rd, rs, rt) + +// branching + +#define R5_J(imm20) \ + R5_J_INSN(OP_JAL, Z0, imm20) +#define R5_JAL(rd, imm20) \ + R5_J_INSN(OP_JAL, rd, imm20) +#define R5_JR(rs, offs12) \ + R5_I_INSN(OP_JALR, _, Z0, rs, offs12) +#define R5_JALR(rd, rs, offs12) \ + R5_I_INSN(OP_JALR, _, rd, rs, offs12) + +// conditional branches; no condition code, these compare rs against rt +#define R5_BCOND(cond, rs, rt, offs13) \ + R5_B_INSN(OP_BCOND, cond, rt, rs, offs13) +#define R5_BCONDZ(cond, rs, offs13) \ + R5_B_INSN(OP_BCOND, cond, Z0, rs, offs13) +#define R5_B(offs13) \ + R5_BCOND(F1_BEQ, Z0, Z0, offs13) + +// load/store indexed base + +#define R5_LW(rd, rs, offs12) \ + R5_I_INSN(OP_LD, F1_W, rd, rs, offs12) +#define R5_LH(rd, rs, offs12) \ + R5_I_INSN(OP_LD, F1_H, rd, rs, offs12) +#define R5_LB(rd, rs, offs12) \ + R5_I_INSN(OP_LD, F1_B, rd, rs, offs12) +#define R5_LHU(rd, rs, offs12) \ + R5_I_INSN(OP_LD, F1_HU, rd, rs, offs12)
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/drc/emit_x86.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/drc/emit_x86.c
Changed
@@ -1,6 +1,7 @@ /* * Basic macros to emit x86 instructions and some utils * Copyright (C) 2008,2009,2010 notaz + * Copyright (C) 2019 kub * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -13,10 +14,11 @@ */ #include <stdarg.h> -enum { xAX = 0, xCX, xDX, xBX, xSP, xBP, xSI, xDI }; +enum { xAX = 0, xCX, xDX, xBX, xSP, xBP, xSI, xDI, // x86-64,i386 common + xR8, xR9, xR10, xR11, xR12, xR13, xR14, xR15 }; // x86-64 only -#define CONTEXT_REG xBP -#define RET_REG xAX +#define CONTEXT_REG xBP +#define RET_REG xAX #define ICOND_JO 0x00 #define ICOND_JNO 0x01 @@ -51,6 +53,9 @@ #define DCOND_VS ICOND_JO // oVerflow Set #define DCOND_VC ICOND_JNO // oVerflow Clear +#define DCOND_CS ICOND_JB // carry set +#define DCOND_CC ICOND_JAE // carry clear + #define EMIT_PTR(ptr, val, type) \ *(type *)(ptr) = val @@ -61,7 +66,8 @@ #define EMIT_OP(op) do { \ COUNT_OP; \ - EMIT(op, u8); \ + if ((op) > 0xff) EMIT((op) >> 8, u8); \ + EMIT((u8)(op), u8); \ } while (0) #define EMIT_MODRM(mod, r, rm) do { \ @@ -106,45 +112,70 @@ EMIT_PTR(ptr + 1, (tcache_ptr - (ptr+2)), u8) // _r_r -#define emith_move_r_r(dst, src) \ - EMIT_OP_MODRM(0x8b, 3, dst, src) +#define emith_move_r_r(dst, src) do {\ + EMIT_REX_IF(0, dst, src); \ + EMIT_OP_MODRM64(0x8b, 3, dst, src); \ +} while (0) #define emith_move_r_r_ptr(dst, src) do { \ EMIT_REX_IF(1, dst, src); \ EMIT_OP_MODRM64(0x8b, 3, dst, src); \ } while (0) -#define emith_add_r_r(d, s) \ - EMIT_OP_MODRM(0x01, 3, s, d) +#define emith_add_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x01, 3, s, d); \ +} while (0) -#define emith_sub_r_r(d, s) \ - EMIT_OP_MODRM(0x29, 3, s, d) +#define emith_add_r_r_ptr(d, s) do { \ + EMIT_REX_IF(1, s, d); \ + EMIT_OP_MODRM64(0x01, 3, s, d); \ +} while (0) -#define emith_adc_r_r(d, s) \ - EMIT_OP_MODRM(0x11, 3, s, d) +#define emith_sub_r_r(d, s) do {\ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x29, 3, s, d); \ +} while (0) -#define emith_sbc_r_r(d, s) \ - EMIT_OP_MODRM(0x19, 3, s, d) /* SBB */ +#define emith_adc_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x11, 3, s, d); \ +} while (0) -#define emith_or_r_r(d, s) \ - EMIT_OP_MODRM(0x09, 3, s, d) +#define emith_sbc_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x19, 3, s, d); /* SBB */ \ +} while (0) -#define emith_and_r_r(d, s) \ - EMIT_OP_MODRM(0x21, 3, s, d) +#define emith_or_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x09, 3, s, d); \ +} while (0) -#define emith_eor_r_r(d, s) \ - EMIT_OP_MODRM(0x31, 3, s, d) /* XOR */ +#define emith_and_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x21, 3, s, d); \ +} while (0) -#define emith_tst_r_r(d, s) \ - EMIT_OP_MODRM(0x85, 3, s, d) /* TEST */ +#define emith_eor_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x31, 3, s, d); /* XOR */ \ +} while (0) + +#define emith_tst_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x85, 3, s, d); /* TEST */ \ +} while (0) #define emith_tst_r_r_ptr(d, s) do { \ EMIT_REX_IF(1, s, d); \ EMIT_OP_MODRM64(0x85, 3, s, d); /* TEST */ \ } while (0) -#define emith_cmp_r_r(d, s) \ - EMIT_OP_MODRM(0x39, 3, s, d) +#define emith_cmp_r_r(d, s) do { \ + EMIT_REX_IF(0, s, d); \ + EMIT_OP_MODRM64(0x39, 3, s, d); \ +} while (0) // fake teq - test equivalence - get_flags(d ^ s) #define emith_teq_r_r(d, s) do { \ @@ -156,7 +187,8 @@ #define emith_mvn_r_r(d, s) do { \ if (d != s) \ emith_move_r_r(d, s); \ - EMIT_OP_MODRM(0xf7, 3, 2, d); /* NOT d */ \ + EMIT_REX_IF(0, 0, d); \ + EMIT_OP_MODRM64(0xf7, 3, 2, d); /* NOT d */ \ } while (0) #define emith_negc_r_r(d, s) do { \ @@ -170,7 +202,8 @@ #define emith_neg_r_r(d, s) do { \ if (d != s) \ emith_move_r_r(d, s); \ - EMIT_OP_MODRM(0xf7, 3, 3, d); /* NEG d */ \ + EMIT_REX_IF(0, 0, d); \ + EMIT_OP_MODRM64(0xf7, 3, 3, d); /* NEG d */ \ } while (0) // _r_r_r @@ -185,6 +218,72 @@ } \ } while (0) +#define emith_add_r_r_r_ptr(d, s1, s2) do { \ + if (d == s1) { \ + emith_add_r_r_ptr(d, s2); \ + } else if (d == s2) { \ + emith_add_r_r_ptr(d, s1); \ + } else { \ + emith_move_r_r_ptr(d, s1); \ + emith_add_r_r_ptr(d, s2); \ + } \ +} while (0) + +#define emith_sub_r_r_r(d, s1, s2) do { \ + if (d == s1) { \ + emith_sub_r_r(d, s2); \ + } else if (d == s2) { \ + emith_sub_r_r(d, s1); \ + } else { \ + emith_move_r_r(d, s1); \ + emith_sub_r_r(d, s2); \ + } \ +} while (0) + +#define emith_adc_r_r_r(d, s1, s2) do { \ + if (d == s1) { \ + emith_adc_r_r(d, s2); \ + } else if (d == s2) { \ + emith_adc_r_r(d, s1); \ + } else { \ + emith_move_r_r(d, s1); \ + emith_adc_r_r(d, s2); \ + } \ +} while (0) + +#define emith_sbc_r_r_r(d, s1, s2) do { \ + if (d == s1) { \ + emith_sbc_r_r(d, s2); \ + } else if (d == s2) { \ + emith_sbc_r_r(d, s1); \ + } else { \ + emith_move_r_r(d, s1); \ + emith_sbc_r_r(d, s2); \ + } \
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/compiler.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/compiler.c
Changed
@@ -1,26 +1,30 @@ /* * SH2 recompiler * (C) notaz, 2009,2010,2013 + * (C) kub, 2018,2019,2020 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. * * notes: - * - tcache, block descriptor, link buffer overflows result in sh2_translate() - * failure, followed by full tcache invalidation for that region + * - tcache, block descriptor, block entry buffer overflows result in oldest + * blocks being deleted until enough space is available + * - link and list element buffer overflows result in failure and exit * - jumps between blocks are tracked for SMC handling (in block_entry->links), - * except jumps between different tcaches + * except jumps from global to CPU-local tcaches * * implemented: * - static register allocation * - remaining register caching and tracking in temporaries * - block-local branch linking - * - block linking (except between tcaches) + * - block linking * - some constant propagation + * - call stack caching for host block entry address + * - delay, poll, and idle loop detection and handling + * - some T/M flag optimizations where the value is known or isn't used * * TODO: * - better constant propagation - * - stack caching? * - bug fixing */ #include <stddef.h> @@ -38,14 +42,18 @@ // features #define PROPAGATE_CONSTANTS 1 #define LINK_BRANCHES 1 - -// limits (per block) -#define MAX_BLOCK_SIZE (BLOCK_INSN_LIMIT * 6 * 6) - -// max literal offset from the block end -#define MAX_LITERAL_OFFSET 32*2 -#define MAX_LITERALS (BLOCK_INSN_LIMIT / 4) -#define MAX_LOCAL_BRANCHES 32 +#define BRANCH_CACHE 1 +#define CALL_STACK 1 +#define ALIAS_REGISTERS 1 +#define REMAP_REGISTER 1 +#define LOOP_DETECTION 1 +#define LOOP_OPTIMIZER 1 +#define T_OPTIMIZER 1 +#define DIV_OPTIMIZER 0 + +#define MAX_LITERAL_OFFSET 0x200 // max. MOVA, MOV @(PC) offset +#define MAX_LOCAL_TARGETS (BLOCK_INSN_LIMIT / 4) +#define MAX_LOCAL_BRANCHES (BLOCK_INSN_LIMIT / 2) // debug stuff // 01 - warnings/errors @@ -53,9 +61,16 @@ // 04 - asm // 08 - runtime block entry log // 10 - smc self-check +// 20 - runtime block entry counter +// 40 - rcache checking +// 80 - branch cache statistics +// 100 - write trace +// 200 - compare trace +// 400 - block entry backtrace on exit +// 800 - state dump on exit // { #ifndef DRC_DEBUG -#define DRC_DEBUG 0 +#define DRC_DEBUG 0//x847 #endif #if DRC_DEBUG @@ -73,6 +88,7 @@ #define dbg(...) #endif + /// #define FETCH_OP(pc) \ dr_pc_base[(pc) / 2] @@ -93,13 +109,21 @@ #define GET_Rn() \ ((op >> 8) & 0x0f) -#define BITMASK1(v0) (1 << (v0)) -#define BITMASK2(v0,v1) ((1 << (v0)) | (1 << (v1))) -#define BITMASK3(v0,v1,v2) (BITMASK2(v0,v1) | (1 << (v2))) -#define BITMASK4(v0,v1,v2,v3) (BITMASK3(v0,v1,v2) | (1 << (v3))) -#define BITMASK5(v0,v1,v2,v3,v4) (BITMASK4(v0,v1,v2,v3) | (1 << (v4))) +#define SHR_T 30 // separate T for not-used detection +#define SHR_MEM 31 +#define SHR_TMP -1 -#define SHR_T SHR_SR // might make them separate someday +#define T 0x00000001 +#define S 0x00000002 +#define I 0x000000f0 +#define Q 0x00000100 +#define M 0x00000200 +#define T_save 0x00000800 + +#define I_SHIFT 4 +#define Q_SHIFT 8 +#define M_SHIFT 9 +#define T_SHIFT 11 static struct op_data { u8 op; @@ -115,319 +139,438 @@ enum op_types { OP_UNHANDLED = 0, OP_BRANCH, + OP_BRANCH_N, // conditional known not to be taken OP_BRANCH_CT, // conditional, branch if T set OP_BRANCH_CF, // conditional, branch if T clear OP_BRANCH_R, // indirect OP_BRANCH_RF, // indirect far (PC + Rm) OP_SETCLRT, // T flag set/clear OP_MOVE, // register move + OP_LOAD_CONST,// load const to register OP_LOAD_POOL, // literal pool load, imm is address - OP_MOVA, - OP_SLEEP, - OP_RTE, + OP_MOVA, // MOVA instruction + OP_SLEEP, // SLEEP instruction + OP_RTE, // RTE instruction + OP_TRAPA, // TRAPA instruction + OP_LDC, // LDC instruction + OP_DIV0, // DIV0[US] instruction + OP_UNDEFINED, }; -#ifdef DRC_SH2 +struct div { + u32 state:1; // 0: expect DIV1/ROTCL, 1: expect DIV1 + u32 rn:5, rm:5, ro:5; // rn and rm for DIV1, ro for ROTCL + u32 div1:8, rotcl:8; // DIV1 count, ROTCL count +}; +union _div { u32 imm; struct div div; }; // XXX tut-tut type punning... +#define div(opd) ((union _div *)&((opd)->imm))->div -static int literal_disabled_frames; +// XXX consider trap insns: OP_TRAPA, OP_UNDEFINED? +#define OP_ISBRANCH(op) ((BITRANGE(OP_BRANCH, OP_BRANCH_RF)| BITMASK1(OP_RTE)) \ + & BITMASK1(op)) +#define OP_ISBRAUC(op) (BITMASK4(OP_BRANCH, OP_BRANCH_R, OP_BRANCH_RF, OP_RTE) \ + & BITMASK1(op)) +#define OP_ISBRACND(op) (BITMASK2(OP_BRANCH_CT, OP_BRANCH_CF) \ + & BITMASK1(op)) +#define OP_ISBRAIMM(op) (BITMASK3(OP_BRANCH, OP_BRANCH_CT, OP_BRANCH_CF) \ + & BITMASK1(op)) +#define OP_ISBRAIND(op) (BITMASK3(OP_BRANCH_R, OP_BRANCH_RF, OP_RTE) \ + & BITMASK1(op)) + +#ifdef DRC_SH2 #if (DRC_DEBUG & 4) static u8 *tcache_dsm_ptrs[3]; static char sh2dasm_buff[64]; #define do_host_disasm(tcid) \ - host_dasm(tcache_dsm_ptrs[tcid], tcache_ptr - tcache_dsm_ptrs[tcid]); \ - tcache_dsm_ptrs[tcid] = tcache_ptr + host_dasm(tcache_dsm_ptrs[tcid], emith_insn_ptr() - tcache_dsm_ptrs[tcid]); \ + tcache_dsm_ptrs[tcid] = emith_insn_ptr() #else #define do_host_disasm(x) #endif -#if (DRC_DEBUG & 8) || defined(PDB) +#define SH2_DUMP(sh2, reason) { \ + char ms = (sh2)->is_slave ? 's' : 'm'; \ + printf("%csh2 %s %08x\n", ms, reason, (sh2)->pc); \ + printf("%csh2 r0-7 %08x %08x %08x %08x %08x %08x %08x %08x\n", ms, \ + (sh2)->r[0], (sh2)->r[1], (sh2)->r[2], (sh2)->r[3], \ + (sh2)->r[4], (sh2)->r[5], (sh2)->r[6], (sh2)->r[7]); \ + printf("%csh2 r8-15 %08x %08x %08x %08x %08x %08x %08x %08x\n", ms, \ + (sh2)->r[8], (sh2)->r[9], (sh2)->r[10], (sh2)->r[11], \ + (sh2)->r[12], (sh2)->r[13], (sh2)->r[14], (sh2)->r[15]); \ + printf("%csh2 pc-ml %08x %08x %08x %08x %08x %08x %08x %08x\n", ms, \ + (sh2)->pc, (sh2)->ppc, (sh2)->pr, (sh2)->sr&0xfff, \ + (sh2)->gbr, (sh2)->vbr, (sh2)->mach, (sh2)->macl); \ + printf("%csh2 tmp-p %08x %08x %08x %08x %08x %08x %08x %08x\n", ms, \ + (sh2)->drc_tmp, (sh2)->irq_cycles, \ + (sh2)->pdb_io_csum[0], (sh2)->pdb_io_csum[1], (sh2)->state, \ + (sh2)->poll_addr, (sh2)->poll_cycles, (sh2)->poll_cnt); \ +} + +#if (DRC_DEBUG & (8|256|512|1024)) || defined(PDB) +#if (DRC_DEBUG & (256|512|1024)) +static SH2 csh2[2][8];
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/compiler.h -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/compiler.h
Changed
@@ -1,27 +1,86 @@ int sh2_drc_init(SH2 *sh2); void sh2_drc_finish(SH2 *sh2); -void sh2_drc_wcheck_ram(unsigned int a, int val, int cpuid); -void sh2_drc_wcheck_da(unsigned int a, int val, int cpuid); +void sh2_drc_wcheck_ram(uint32_t a, unsigned len, SH2 *sh2); +void sh2_drc_wcheck_da(uint32_t a, unsigned len, SH2 *sh2); #ifdef DRC_SH2 void sh2_drc_mem_setup(SH2 *sh2); void sh2_drc_flush_all(void); -void sh2_drc_frame(void); #else #define sh2_drc_mem_setup(x) #define sh2_drc_flush_all() #define sh2_drc_frame() #endif -#define BLOCK_INSN_LIMIT 128 +#define BLOCK_INSN_LIMIT 1024 /* op_flags */ #define OF_DELAY_OP (1 << 0) #define OF_BTARGET (1 << 1) -#define OF_T_SET (1 << 2) // T is known to be set -#define OF_T_CLEAR (1 << 3) // ... clear +#define OF_LOOP (3 << 2) // NONE, IDLE, DELAY, POLL loop #define OF_B_IN_DS (1 << 4) +#define OF_DELAY_INSN (1 << 5) // DT, (TODO ADD+CMP?) +#define OF_POLL_INSN (1 << 6) // MOV @(...),Rn (no post increment), TST @(...) +#define OF_BASIC_LOOP (1 << 7) // pinnable loop without any branches in it -void scan_block(unsigned int base_pc, int is_slave, - unsigned char *op_flags, unsigned int *end_pc, - unsigned int *end_literals); +#define OF_IDLE_LOOP (1 << 2) +#define OF_DELAY_LOOP (2 << 2) +#define OF_POLL_LOOP (3 << 2) + +unsigned short scan_block(uint32_t base_pc, int is_slave, + unsigned char *op_flags, uint32_t *end_pc, + uint32_t *base_literals, uint32_t *end_literals); + +#if defined(DRC_SH2) && defined(__GNUC__) && !defined(__clang__) +// direct access to some host CPU registers used by the DRC if gcc is used. +// XXX MUST match SHR_SR definitions in cpu/drc/emit_*.c; should be moved there +// XXX yuck, there's no portable way to determine register size. Use long long +// if target is 64 bit and data model is ILP32 or LLP64(windows), else long +#if defined(__arm__) +#define DRC_SR_REG "r10" +#define DRC_REG_LL 0 // 32 bit +#elif defined(__aarch64__) +#define DRC_SR_REG "r28" +#define DRC_REG_LL (__ILP32__ || _WIN32) +#elif defined(__mips__) +#define DRC_SR_REG "s6" +#define DRC_REG_LL (_MIPS_SIM == _ABIN32) +#elif defined(__riscv__) || defined(__riscv) +#define DRC_SR_REG "s11" +#define DRC_REG_LL 0 // no ABI for (__ILP32__ && __riscv_xlen != 32) +#elif defined(__powerpc__) +#define DRC_SR_REG "r30" +#define DRC_REG_LL 0 // no ABI for __ILP32__ +#elif defined(__i386__) +#define DRC_SR_REG "edi" +#define DRC_REG_LL 0 // 32 bit +#elif defined(__x86_64__) +#define DRC_SR_REG "rbx" +#define DRC_REG_LL (__ILP32__ || _WIN32) +#endif +#endif + +#ifdef DRC_SR_REG +// XXX this is more clear but produces too much overhead for slow platforms +extern void REGPARM(1) (*sh2_drc_save_sr)(SH2 *sh2); +extern void REGPARM(1) (*sh2_drc_restore_sr)(SH2 *sh2); + +// NB: sh2_sr MUST have register size if optimizing with -O3 (-fif-conversion) +#if DRC_REG_LL +#define DRC_DECLARE_SR register long long _sh2_sr asm(DRC_SR_REG) +#else +#define DRC_DECLARE_SR register long _sh2_sr asm(DRC_SR_REG) +#endif +#define DRC_SAVE_SR(sh2) \ + if (likely(sh2->state & SH2_IN_DRC)) \ + sh2->sr = (s32)_sh2_sr +// sh2_drc_save_sr(sh2) +#define DRC_RESTORE_SR(sh2) \ + if (likely(sh2->state & SH2_IN_DRC)) \ + _sh2_sr = (s32)sh2->sr +// sh2_drc_restore_sr(sh2) +#else +#define DRC_DECLARE_SR +#define DRC_SAVE_SR(sh2) +#define DRC_RESTORE_SR(sh2) +#endif
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/mame/sh2.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/mame/sh2.c
Changed
@@ -372,7 +372,7 @@ #if BUSY_LOOP_HACKS if (disp == -2) { - UINT32 next_opcode = RW( sh2, sh2->ppc & AM ); + UINT32 next_opcode = (UINT32)(UINT16)RW( sh2, sh2->ppc & AM ); /* BRA $ * NOP */ @@ -802,7 +802,7 @@ sh2->sr &= ~T; #if BUSY_LOOP_HACKS { - UINT32 next_opcode = RW( sh2, sh2->ppc & AM ); + UINT32 next_opcode = (UINT32)(UINT16)RW( sh2, sh2->ppc & AM ); /* DT Rn * BF $-2 */ @@ -1049,12 +1049,12 @@ INT32 tempm, tempn, dest, src, ans; UINT32 templ; - tempn = (INT32) RW( sh2, sh2->r[n] ); + tempn = (INT32)(INT16) RW( sh2, sh2->r[n] ); sh2->r[n] += 2; - tempm = (INT32) RW( sh2, sh2->r[m] ); + tempm = (INT32)(INT16) RW( sh2, sh2->r[m] ); sh2->r[m] += 2; templ = sh2->macl; - tempm = ((INT32) (short) tempn * (INT32) (short) tempm); + tempm = (tempn * tempm); if ((INT32) sh2->macl >= 0) dest = 0; else
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/mame/sh2dasm.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/mame/sh2dasm.c
Changed
@@ -465,7 +465,7 @@ sprintf(buffer, "MOV.B @($%02X,%s),R0", (opcode & 15), regname[Rm]); break; case 5: - sprintf(buffer, "MOV.W @($%02X,%s),R0", (opcode & 15), regname[Rm]); + sprintf(buffer, "MOV.W @($%02X,%s),R0", (opcode & 15) * 2, regname[Rm]); break; case 8: sprintf(buffer, "CMP/EQ #$%02X,R0", (opcode & 0xff));
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/mame/sh2pico.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/mame/sh2pico.c
Changed
@@ -121,7 +121,7 @@ if (sh2->delay) { sh2->ppc = sh2->delay; - opcode = RW(sh2, sh2->delay); + opcode = (UINT32)(UINT16)RW(sh2, sh2->delay); // TODO: more branch types if ((opcode >> 13) == 5) { // BRA/BSR @@ -139,7 +139,7 @@ else { sh2->ppc = sh2->pc; - opcode = RW(sh2, sh2->pc); + opcode = (UINT32)(UINT16)RW(sh2, sh2->pc); } sh2->delay = 0; @@ -214,7 +214,7 @@ if (sh2->pc < *base_pc || sh2->pc >= *end_pc) { *base_pc = sh2->pc; scan_block(*base_pc, sh2->is_slave, - op_flags, end_pc, NULL); + op_flags, end_pc, NULL, NULL); } if ((op_flags[(sh2->pc - *base_pc) / 2] & OF_BTARGET) || sh2->pc == *base_pc @@ -232,13 +232,13 @@ if (sh2->delay) { sh2->ppc = sh2->delay; - opcode = RW(sh2, sh2->delay); + opcode = (UINT32)(UINT16)RW(sh2, sh2->delay); sh2->pc -= 2; } else { sh2->ppc = sh2->pc; - opcode = RW(sh2, sh2->pc); + opcode = (UINT32)(UINT16)RW(sh2, sh2->pc); } sh2->delay = 0;
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/sh2.c -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/sh2.c
Changed
@@ -84,7 +84,7 @@ // do this to avoid missing irqs that other SH2 might clear int vector = sh2->irq_callback(sh2, level); sh2_do_irq(sh2, level, vector); - sh2->m68krcycles_done += C_SH2_TO_M68K(*sh2, 13); + sh2->m68krcycles_done += C_SH2_TO_M68K(sh2, 13); } else sh2->test_irq = 1;
View file
libretro-picodrive-0~git20200112.tar.xz/cpu/sh2/sh2.h -> libretro-picodrive-0~git20200716.tar.xz/cpu/sh2/sh2.h
Changed
@@ -1,6 +1,7 @@ #ifndef __SH2_H__ #define __SH2_H__ +#include "../../pico/pico_types.h" #include "../../pico/pico_port.h" // registers - matches structure order @@ -8,42 +9,58 @@ SHR_R0 = 0, SHR_SP = 15, SHR_PC, SHR_PPC, SHR_PR, SHR_SR, SHR_GBR, SHR_VBR, SHR_MACH, SHR_MACL, + SH2_REGS // register set size } sh2_reg_e; +#define SHR_R(n) (SHR_R0+(n)) typedef struct SH2_ { - unsigned int r[16]; // 00 - unsigned int pc; // 40 - unsigned int ppc; - unsigned int pr; - unsigned int sr; - unsigned int gbr, vbr; // 50 - unsigned int mach, macl; // 58 + // registers. this MUST correlate with enum sh2_reg_e. + uint32_t r[16] ALIGNED(32); + uint32_t pc; // 40 + uint32_t ppc; + uint32_t pr; + uint32_t sr; + uint32_t gbr, vbr; // 50 + uint32_t mach, macl; // 58 // common - const void *read8_map; // 60 + const void *read8_map; const void *read16_map; + const void *read32_map; const void **write8_tab; const void **write16_tab; + const void **write32_tab; // drc stuff - int drc_tmp; // 70 + int drc_tmp; int irq_cycles; void *p_bios; // convenience pointers void *p_da; - void *p_sdram; // 80 + void *p_sdram; void *p_rom; + void *p_dram; + void *p_drcblk_da; + void *p_drcblk_ram; unsigned int pdb_io_csum[2]; #define SH2_STATE_RUN (1 << 0) // to prevent recursion -#define SH2_STATE_SLEEP (1 << 1) +#define SH2_STATE_SLEEP (1 << 1) // temporarily stopped (DMA, IO, ...) #define SH2_STATE_CPOLL (1 << 2) // polling comm regs #define SH2_STATE_VPOLL (1 << 3) // polling VDP +#define SH2_STATE_RPOLL (1 << 4) // polling address in SDRAM +#define SH2_TIMER_RUN (1 << 7) // SOC WDT timer is running +#define SH2_IN_DRC (1 << 8) // DRC in use unsigned int state; - unsigned int poll_addr; + uint32_t poll_addr; int poll_cycles; int poll_cnt; + // DRC branch cache. size must be 2^n and <=128 + int rts_cache_idx; + struct { uint32_t pc; void *code; } rts_cache[16]; + struct { uint32_t pc; void *code; } branch_cache[128]; + // interpreter stuff int icount; // cycles left in current timeslice unsigned int ea; @@ -60,21 +77,22 @@ unsigned int cycles_timeslice; struct SH2_ *other_sh2; + int (*run)(struct SH2_ *, int); // we use 68k reference cycles for easier sync unsigned int m68krcycles_done; unsigned int mult_m68k_to_sh2; unsigned int mult_sh2_to_m68k; - unsigned char data_array[0x1000]; // cache (can be used as RAM) - unsigned int peri_regs[0x200/4]; // periphereal regs + uint8_t data_array[0x1000]; // cache (can be used as RAM) + uint32_t peri_regs[0x200/4]; // peripheral regs } SH2; #define CYCLE_MULT_SHIFT 10 #define C_M68K_TO_SH2(xsh2, c) \ - ((int)((c) * (xsh2).mult_m68k_to_sh2) >> CYCLE_MULT_SHIFT) + (int)(((uint64_t)(c) * (xsh2)->mult_m68k_to_sh2) >> CYCLE_MULT_SHIFT) #define C_SH2_TO_M68K(xsh2, c) \ - ((int)((c + 3) * (xsh2).mult_sh2_to_m68k) >> CYCLE_MULT_SHIFT) + (int)(((uint64_t)(c+3U) * (xsh2)->mult_sh2_to_m68k) >> CYCLE_MULT_SHIFT) int sh2_init(SH2 *sh2, int is_slave, SH2 *other_sh2); void sh2_finish(SH2 *sh2); @@ -88,17 +106,21 @@ int sh2_execute_drc(SH2 *sh2c, int cycles); int sh2_execute_interpreter(SH2 *sh2c, int cycles); -static __inline int sh2_execute(SH2 *sh2, int cycles, int use_drc) +static __inline void sh2_execute_prepare(SH2 *sh2, int use_drc) +{ +#ifdef DRC_SH2 + sh2->run = use_drc ? sh2_execute_drc : sh2_execute_interpreter; +#else + sh2->run = sh2_execute_interpreter; +#endif +} + +static __inline int sh2_execute(SH2 *sh2, int cycles) { int ret; sh2->cycles_timeslice = cycles; -#ifdef DRC_SH2 - if (use_drc) - ret = sh2_execute_drc(sh2, cycles); - else -#endif - ret = sh2_execute_interpreter(sh2, cycles); + ret = sh2->run(sh2, cycles); return sh2->cycles_timeslice - ret; } @@ -108,12 +130,12 @@ // pico memhandlers // XXX: move somewhere else -unsigned int REGPARM(2) p32x_sh2_read8(unsigned int a, SH2 *sh2); -unsigned int REGPARM(2) p32x_sh2_read16(unsigned int a, SH2 *sh2); -unsigned int REGPARM(2) p32x_sh2_read32(unsigned int a, SH2 *sh2); -void REGPARM(3) p32x_sh2_write8 (unsigned int a, unsigned int d, SH2 *sh2); -void REGPARM(3) p32x_sh2_write16(unsigned int a, unsigned int d, SH2 *sh2); -void REGPARM(3) p32x_sh2_write32(unsigned int a, unsigned int d, SH2 *sh2); +u32 REGPARM(2) p32x_sh2_read8(u32 a, SH2 *sh2); +u32 REGPARM(2) p32x_sh2_read16(u32 a, SH2 *sh2); +u32 REGPARM(2) p32x_sh2_read32(u32 a, SH2 *sh2); +void REGPARM(3) p32x_sh2_write8 (u32 a, u32 d, SH2 *sh2); +void REGPARM(3) p32x_sh2_write16(u32 a, u32 d, SH2 *sh2); +void REGPARM(3) p32x_sh2_write32(u32 a, u32 d, SH2 *sh2); // debug #ifdef DRC_CMP
View file
libretro-picodrive-0~git20200112.tar.xz/jni/Android.mk -> libretro-picodrive-0~git20200716.tar.xz/jni/Android.mk
Changed
@@ -18,7 +18,6 @@ use_drz80 := 0 use_cz80 := 1 use_sh2drc := 0 -use_sh2mame := 1 use_svpdrc := 0 asm_memory := 0 @@ -27,16 +26,25 @@ asm_misc := 0 asm_cdmemory := 0 asm_mix := 0 +asm_32xdraw := 0 +asm_32xmemory := 0 ifeq ($(TARGET_ARCH),arm) - use_cyclone := 1 - use_sh2drc := 1 - use_svpdrc := 1 - asm_render := 1 - asm_misc := 1 - asm_mix := 1 - use_fame := 0 - use_sh2mame := 0 +# use_cyclone := 1 +# use_fame := 0 +# use_drz80 := 1 +# use_cz80 := 0 + use_sh2drc := 1 +# use_svpdrc := 1 + +# asm_memory := 1 +# asm_render := 1 +# asm_ym2612 := 1 +# asm_misc := 1 +# asm_cdmemory := 1 +# asm_mix := 1 +# asm_32xdraw := 1 +# asm_32xmemory := 1 endif ifeq ($(TARGET_ARCH_ABI),armeabi) @@ -47,16 +55,25 @@ SOURCES_C := $(LIBRETRO_DIR)/libretro.c \ $(COMMON_DIR)/mp3.c \ + $(COMMON_DIR)/mp3_sync.c \ $(COMMON_DIR)/mp3_dummy.c \ $(UNZIP_DIR)/unzip.c COREFLAGS := $(addprefix -D,$(DEFINES)) -fno-strict-aliasing -GIT_VERSION := " $(shell git rev-parse --short HEAD || echo unknown)" -ifneq ($(GIT_VERSION)," unknown") +GIT_VERSION := $(shell git rev-parse --short HEAD || echo unknown) +ifneq ($(GIT_VERSION),"unknown") COREFLAGS += -DGIT_VERSION=\"$(GIT_VERSION)\" endif +ifneq ($(filter armeabi%, $(TARGET_ARCH_ABI)),) +$(CORE_DIR)/pico/pico_int_offs.h: + cp $(CORE_DIR)/tools/offsets/generic-ilp32-offsets.h $@ +.PHONY: $(CORE_DIR)/pico/pico_int_offs.h + +$(filter %.S,$(SRCS_COMMON)): $(CORE_DIR)/pico/pico_int_offs.h +endif + include $(CLEAR_VARS) LOCAL_MODULE := retro LOCAL_SRC_FILES := $(SRCS_COMMON) $(SOURCES_C)
View file
libretro-picodrive-0~git20200112.tar.xz/pico/32x/32x.c -> libretro-picodrive-0~git20200716.tar.xz/pico/32x/32x.c
Changed
@@ -12,7 +12,7 @@ struct Pico32x Pico32x; SH2 sh2s[2]; -#define SH2_IDLE_STATES (SH2_STATE_CPOLL|SH2_STATE_VPOLL|SH2_STATE_SLEEP) +#define SH2_IDLE_STATES (SH2_STATE_CPOLL|SH2_STATE_VPOLL|SH2_STATE_RPOLL|SH2_STATE_SLEEP) static int REGPARM(2) sh2_irq_cb(SH2 *sh2, int level) { @@ -30,7 +30,7 @@ } // MUST specify active_sh2 when called from sh2 memhandlers -void p32x_update_irls(SH2 *active_sh2, int m68k_cycles) +void p32x_update_irls(SH2 *active_sh2, unsigned int m68k_cycles) { int irqs, mlvl = 0, slvl = 0; int mrun, srun; @@ -38,30 +38,32 @@ if (active_sh2 != NULL) m68k_cycles = sh2_cycles_done_m68k(active_sh2); + // find top bit = highest irq number (0 <= irl <= 14/2) by binary search + // msh2 irqs = Pico32x.sh2irqs | Pico32x.sh2irqi[0]; - while ((irqs >>= 1)) - mlvl++; - mlvl *= 2; + if (irqs >= 0x10) mlvl += 8, irqs >>= 4; + if (irqs >= 0x04) mlvl += 4, irqs >>= 2; + if (irqs >= 0x02) mlvl += 2, irqs >>= 1; // ssh2 irqs = Pico32x.sh2irqs | Pico32x.sh2irqi[1]; - while ((irqs >>= 1)) - slvl++; - slvl *= 2; + if (irqs >= 0x10) slvl += 8, irqs >>= 4; + if (irqs >= 0x04) slvl += 4, irqs >>= 2; + if (irqs >= 0x02) slvl += 2, irqs >>= 1; - mrun = sh2_irl_irq(&msh2, mlvl, active_sh2 == &msh2); + mrun = sh2_irl_irq(&msh2, mlvl, msh2.state & SH2_STATE_RUN); if (mrun) { p32x_sh2_poll_event(&msh2, SH2_IDLE_STATES, m68k_cycles); - if (active_sh2 == &msh2) - sh2_end_run(active_sh2, 1); + if (msh2.state & SH2_STATE_RUN) + sh2_end_run(&msh2, 1); } - srun = sh2_irl_irq(&ssh2, slvl, active_sh2 == &ssh2); + srun = sh2_irl_irq(&ssh2, slvl, ssh2.state & SH2_STATE_RUN); if (srun) { p32x_sh2_poll_event(&ssh2, SH2_IDLE_STATES, m68k_cycles); - if (active_sh2 == &ssh2) - sh2_end_run(active_sh2, 1); + if (ssh2.state & SH2_STATE_RUN) + sh2_end_run(&ssh2, 1); } elprintf(EL_32X, "update_irls: m %d/%d, s %d/%d", mlvl, mrun, slvl, srun); @@ -70,7 +72,7 @@ // the mask register is inconsistent, CMD is supposed to be a mask, // while others are actually irq trigger enables? // TODO: test on hw.. -void p32x_trigger_irq(SH2 *sh2, int m68k_cycles, unsigned int mask) +void p32x_trigger_irq(SH2 *sh2, unsigned int m68k_cycles, unsigned int mask) { Pico32x.sh2irqs |= mask & P32XI_VRES; Pico32x.sh2irqi[0] |= mask & (Pico32x.sh2irq_mask[0] << 3); @@ -79,7 +81,7 @@ p32x_update_irls(sh2, m68k_cycles); } -void p32x_update_cmd_irq(SH2 *sh2, int m68k_cycles) +void p32x_update_cmd_irq(SH2 *sh2, unsigned int m68k_cycles) { if ((Pico32x.sh2irq_mask[0] & 2) && (Pico32x.regs[2 / 2] & 1)) Pico32x.sh2irqi[0] |= P32XI_CMD; @@ -194,11 +196,11 @@ void PicoUnload32x(void) { + sh2_finish(&msh2); + sh2_finish(&ssh2); if (Pico32xMem != NULL) plat_munmap(Pico32xMem, sizeof(*Pico32xMem)); Pico32xMem = NULL; - sh2_finish(&msh2); - sh2_finish(&ssh2); PicoIn.AHW &= ~PAHW_32X; } @@ -207,8 +209,8 @@ { if (PicoIn.AHW & PAHW_32X) { p32x_trigger_irq(NULL, SekCyclesDone(), P32XI_VRES); - p32x_sh2_poll_event(&msh2, SH2_IDLE_STATES, 0); - p32x_sh2_poll_event(&ssh2, SH2_IDLE_STATES, 0); + p32x_sh2_poll_event(&msh2, SH2_IDLE_STATES, SekCyclesDone()); + p32x_sh2_poll_event(&ssh2, SH2_IDLE_STATES, SekCyclesDone()); p32x_pwm_ctl_changed(); p32x_timers_recalc(); } @@ -254,11 +256,11 @@ } p32x_trigger_irq(NULL, SekCyclesDone(), P32XI_VINT); - p32x_sh2_poll_event(&msh2, SH2_STATE_VPOLL, 0); - p32x_sh2_poll_event(&ssh2, SH2_STATE_VPOLL, 0); + p32x_sh2_poll_event(&msh2, SH2_STATE_VPOLL, SekCyclesDone()); + p32x_sh2_poll_event(&ssh2, SH2_STATE_VPOLL, SekCyclesDone()); } -void p32x_schedule_hint(SH2 *sh2, int m68k_cycles) +void p32x_schedule_hint(SH2 *sh2, unsigned int m68k_cycles) { // rather rough, 32x hint is useless in practice int after; @@ -267,7 +269,8 @@ return; // nobody cares // note: when Pico.m.scanline is 224, SH2s might // still be at scanline 93 (or so) - if (!(Pico32x.sh2_regs[0] & 0x80) && Pico.m.scanline > 224) + if (!(Pico32x.sh2_regs[0] & 0x80) && + Pico.m.scanline > (Pico.video.reg[1] & 0x08 ? 240 : 224)) return; after = (Pico32x.sh2_regs[4 / 2] + 1) * 488; @@ -323,8 +326,12 @@ p32x_event_schedule(now, event, after); - left_to_next = (event_time_next - now) * 3; - sh2_end_run(sh2, left_to_next); + left_to_next = C_M68K_TO_SH2(sh2, (int)(event_time_next - now)); + if (sh2_cycles_left(sh2) > left_to_next) { + if (left_to_next < 1) + left_to_next = 1; + sh2_end_run(sh2, left_to_next); + } } static void p32x_run_events(unsigned int until) @@ -366,19 +373,19 @@ oldest, event_time_next); } -static void run_sh2(SH2 *sh2, int m68k_cycles) +static void run_sh2(SH2 *sh2, unsigned int m68k_cycles) { - int cycles, done; + unsigned int cycles, done; pevt_log_sh2_o(sh2, EVT_RUN_START); sh2->state |= SH2_STATE_RUN; - cycles = C_M68K_TO_SH2(*sh2, m68k_cycles); + cycles = C_M68K_TO_SH2(sh2, m68k_cycles); elprintf_sh2(sh2, EL_32X, "+run %u %d @%08x", sh2->m68krcycles_done, cycles, sh2->pc); - done = sh2_execute(sh2, cycles, PicoIn.opt & POPT_EN_DRC); + done = sh2_execute(sh2, cycles); - sh2->m68krcycles_done += C_SH2_TO_M68K(*sh2, done); + sh2->m68krcycles_done += C_SH2_TO_M68K(sh2, done); sh2->state &= ~SH2_STATE_RUN; pevt_log_sh2_o(sh2, EVT_RUN_END); elprintf_sh2(sh2, EL_32X, "-run %u %d", @@ -412,8 +419,7 @@ // there might be new event to schedule current sh2 to if (event_time_next) { - left_to_event = event_time_next - m68k_target; - left_to_event *= 3; + left_to_event = C_M68K_TO_SH2(sh2, (int)(event_time_next - m68k_target)); if (sh2_cycles_left(sh2) > left_to_event) { if (left_to_event < 1) left_to_event = 1; @@ -423,7 +429,7 @@ } #define STEP_LS 24 -#define STEP_N 440 +#define STEP_N 528 // at least one line (488) #define sync_sh2s_normal p32x_sync_sh2s //#define sync_sh2s_lockstep p32x_sync_sh2s @@ -431,7 +437,7 @@ /* most timing is in 68k clock */ void sync_sh2s_normal(unsigned int m68k_target) { - unsigned int now, target, timer_cycles; + unsigned int now, target, next, timer_cycles; int cycles; elprintf(EL_32X, "sh2 sync to %u", m68k_target); @@ -446,6 +452,7 @@
View file
libretro-picodrive-0~git20200112.tar.xz/pico/32x/draw.c -> libretro-picodrive-0~git20200716.tar.xz/pico/32x/draw.c
Changed
@@ -1,6 +1,7 @@ /* * PicoDrive * (C) notaz, 2009,2010 + * (C) kub, 2019 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -11,6 +12,9 @@ int (*PicoScan32xEnd)(unsigned int num); int Pico32xDrawMode; +void *DrawLineDestBase32x; +int DrawLineDestIncrement32x; + static void convert_pal555(int invert_prio) { unsigned int *ps = (void *)Pico32xMem->pal; @@ -44,16 +48,21 @@ const unsigned int m1 = 0x001f; \ const unsigned int m2 = 0x03e0; \ const unsigned int m3 = 0x7c00; \ - int i; \ + unsigned short t; \ + int i = 320; \ \ - for (i = 320; i > 0; i--, pd++, p32x++, pmd++) { \ - unsigned short t = *p32x; \ - if ((*pmd & 0x3f) != mdbg && !((t ^ inv) & 0x8000)) { \ - pmd_draw_code; \ - continue; \ + while (i > 0) { \ + for (; i > 0 && (*pmd & 0x3f) == mdbg; pd++, pmd++, i--) { \ + t = *p32x++; \ + *pd = ((t&m1) << 11) | ((t&m2) << 1) | ((t&m3) >> 10); \ + } \ + for (; i > 0 && (*pmd & 0x3f) != mdbg; pd++, pmd++, i--) { \ + t = *p32x++; \ + if ((t ^ inv) & 0x8000) \ + *pd = ((t&m1) << 11) | ((t&m2) << 1) | ((t&m3) >> 10); \ + else \ + pmd_draw_code; \ } \ - \ - *pd = ((t & m1) << 11) | ((t & m2) << 1) | ((t & m3) >> 10); \ } \ } @@ -61,15 +70,21 @@ #define do_line_pp(pd, p32x, pmd, pmd_draw_code) \ { \ unsigned short t; \ - int i; \ - for (i = 320; i > 0; i--, pd++, p32x++, pmd++) { \ - t = pal[*(unsigned char *)((uintptr_t)p32x ^ 1)]; \ - if ((t & 0x20) || (*pmd & 0x3f) == mdbg) \ + int i = 320; \ + while (i > 0) { \ + for (; i > 0 && (*pmd & 0x3f) == mdbg; pd++, pmd++, i--) { \ + t = pal[*(unsigned char *)((uintptr_t)(p32x++) ^ 1)]; \ *pd = t; \ - else \ - pmd_draw_code; \ + } \ + for (; i > 0 && (*pmd & 0x3f) != mdbg; pd++, pmd++, i--) { \ + t = pal[*(unsigned char *)((uintptr_t)(p32x++) ^ 1)]; \ + if (t & 0x20) \ + *pd = t; \ + else \ + pmd_draw_code; \ + } \ } \ -} +} // run length mode #define do_line_rl(pd, p32x, pmd, pmd_draw_code) \ @@ -233,13 +248,11 @@ int lines_sft_offs; int which_func; - Pico.est.DrawLineDest = (char *)DrawLineDestBase + offs * DrawLineDestIncrement; + Pico.est.DrawLineDest = (char *)DrawLineDestBase32x + offs * DrawLineDestIncrement32x; dram = Pico32xMem->dram[Pico32x.vdp_regs[0x0a/2] & P32XV_FS]; - if (Pico32xDrawMode == PDM32X_BOTH) { - if (Pico.m.dirtyPal) - PicoDrawUpdateHighPal(); - } + if (Pico32xDrawMode == PDM32X_BOTH) + PicoDrawUpdateHighPal(); if ((Pico32x.vdp_regs[0] & P32XV_Mx) == 2) { @@ -278,20 +291,21 @@ void PicoDraw32xLayerMdOnly(int offs, int lines) { int have_scan = PicoScan32xBegin != NULL && PicoScan32xEnd != NULL; - unsigned short *dst = (void *)((char *)DrawLineDestBase + offs * DrawLineDestIncrement); + unsigned short *dst = (void *)((char *)DrawLineDestBase32x + offs * DrawLineDestIncrement32x); unsigned char *pmd = Pico.est.Draw2FB + 328 * offs + 8; unsigned short *pal = Pico.est.HighPal; int poffs = 0, plen = 320; int l, p; if (!(Pico.video.reg[12] & 1)) { - // 32col mode + // 32col mode. for some render modes MD pixel data carries an offset + if (!(PicoIn.opt & (POPT_ALT_RENDERER|POPT_DIS_32C_BORDER))) + pmd += 32; poffs = 32; plen = 256; } - if (Pico.m.dirtyPal) - PicoDrawUpdateHighPal(); + PicoDrawUpdateHighPal(); dst += poffs; for (l = 0; l < lines; l++) { @@ -305,7 +319,7 @@ dst[p + 2] = pal[*pmd++]; dst[p + 3] = pal[*pmd++]; } - dst = (void *)((char *)dst + DrawLineDestIncrement); + dst = (void *)((char *)dst + DrawLineDestIncrement32x); pmd += 328 - plen; if (have_scan) PicoScan32xEnd(l + offs); @@ -314,21 +328,32 @@ void PicoDrawSetOutFormat32x(pdso_t which, int use_32x_line_mode) { -#ifdef _ASM_32X_DRAW - extern void *Pico32xNativePal; - Pico32xNativePal = Pico32xMem->pal_native; -#endif + if (which == PDF_RGB555) { + // need CLUT pixels in PicoDraw2FB for layer transparency + PicoDrawSetInternalBuf(Pico.est.Draw2FB, 328); + PicoDrawSetOutBufMD(DrawLineDestBase32x, DrawLineDestIncrement32x); + } else { + // use the same layout as alt renderer + PicoDrawSetInternalBuf(NULL, 0); + PicoDrawSetOutBufMD(Pico.est.Draw2FB + 8, 328); + } - if (which == PDF_RGB555 && use_32x_line_mode) { + if (use_32x_line_mode) // we'll draw via FinalizeLine32xRGB555 (rare) - PicoDrawSetInternalBuf(NULL, 0); Pico32xDrawMode = PDM32X_OFF; - return; - } + else + // in RGB555 mode the 32x layer is drawn over the MD layer, in the other + // modes 32x and MD layer are merged together by the 32x renderer + Pico32xDrawMode = (which == PDF_RGB555) ? PDM32X_32X_ONLY : PDM32X_BOTH; +} - // use the same layout as alt renderer - PicoDrawSetInternalBuf(Pico.est.Draw2FB, 328); - Pico32xDrawMode = (which == PDF_RGB555) ? PDM32X_32X_ONLY : PDM32X_BOTH; +void PicoDrawSetOutBuf32X(void *dest, int increment) +{ + DrawLineDestBase32x = dest; + DrawLineDestIncrement32x = increment; + // in RGB555 mode this buffer is also used by the MD renderer + if (Pico32xDrawMode != PDM32X_BOTH) + PicoDrawSetOutBufMD(DrawLineDestBase32x, DrawLineDestIncrement32x); } // vim:shiftwidth=2:ts=2:expandtab
View file
libretro-picodrive-0~git20200716.tar.xz/pico/32x/draw_arm.S
Added
@@ -0,0 +1,444 @@ +@* +@* PicoDrive +@* (C) notaz, 2010 +@* (C) kub, 2019 +@* +@* This work is licensed under the terms of MAME license. +@* See COPYING file in the top-level directory. +@* + +#include "pico/arm_features.h" +#include "pico/pico_int_offs.h" + +.extern Pico32x +.extern Pico + +.equiv P32XV_PRI, (1<< 7) + +.text +.align 2 + + PIC_LDR_INIT() + +.macro call_scan_prep cond est @ &Pico.est +.if \cond + PIC_LDR(r4, r6, PicoScan32xBegin) + PIC_LDR(r5, r6, PicoScan32xEnd) + ldr r6, [\est, #OFS_EST_DrawLineDest] + ldr r4, [r4] + ldr r5, [r5] + stmfd sp!, {r4,r5,r6} +.endif +.endm + +.macro call_scan_fin_ge cond +.if \cond + addge sp, sp, #4*3 +.endif +.endm + +.macro call_scan_begin cond +.if \cond + stmfd sp!, {r1-r3} + and r0, r2, #0xff + add r0, r0, r4 + mov lr, pc + ldr pc, [sp, #(3+0)*4] + ldr r0, [sp, #(3+2)*4] @ &DrawLineDest + ldmfd sp!, {r1-r3} + ldr r0, [r0] +.endif +.endm + +.macro call_scan_end cond +.if \cond + stmfd sp!, {r0-r3} + and r0, r2, #0xff + add r0, r0, r4 + mov lr, pc + ldr pc, [sp, #(4+1)*4] + ldmfd sp!, {r0-r3} +.endif +.endm + +@ direct color +@ unsigned short *dst, unsigned short *dram, int lines_sft_offs, int mdbg +.macro make_do_loop_dc name call_scan do_md +.global \name +\name: + stmfd sp!, {r4-r11,lr} + + PIC_LDR(lr, r9, Pico) + PIC_LDR(r10,r9, Pico32x) + ldr r11, [lr, #OFS_Pico_est+OFS_EST_Draw2FB] + ldrh r10,[r10, #0x40] @ Pico32x.vdp_regs[0] + add r9, lr, #OFS_Pico_est+OFS_EST_HighPal @ palmd + + and r4, r2, #0xff + mov r5, #328 + mov r3, r3, lsl #26 @ mdbg << 26 + mla r11,r4,r5,r11 @ r11 = pmd = PicoDraw2FB + offs*328: md data + tst r10,#P32XV_PRI + movne r10,#0 + moveq r10,#0x8000 @ r10 = inv_bit + call_scan_prep \call_scan lr + + mov r4, #0 @ line + b 1f @ loop_outer_entry + +0: @ loop_outer: + call_scan_end \call_scan + add r4, r4, #1 + cmp r4, r2, lsr #16 + call_scan_fin_ge \call_scan + ldmgefd sp!, {r4-r11,pc} + +1: @ loop_outer_entry: + call_scan_begin \call_scan + mov r12,r4, lsl #1 + ldrh r12,[r1, r12] + add r11,r11,#8 + mov r6, #320 + add r5, r1, r12, lsl #1 @ p32x = dram + dram[l] + +2: @ loop_inner: + ldrh r8, [r5], #2 + subs lr, r6, #1 + blt 0b @ loop_outer + +3: @ loop_innermost: + ldrh r7, [r5], #2 @ 32x pixel + subs lr, lr, #1 + cmpge r7, r8 + beq 3b @ loop_innermost + + sub r5, r5, #2 + add lr, lr, #1 + sub lr, r6, lr + sub r6, r6, lr + + eor r12,r8, r10 + tst r12, #0x8000 @ !((t ^ inv) & 0x8000) + bne 5f @ draw_md + + and r7 ,r8, #0x03e0 + mov r8, r8, lsl #11 + orr r8, r8, r8, lsr #(10+11) + orr r8, r8, r7 ,lsl #1 + bic r8, r8, #0x0020 @ kill prio bit + + add r11,r11,lr + tst r0, #2 @ dst unaligned? + strneh r8, [r0], #2 + subne lr, lr, #1 + cmp lr, #0 + beq 2b @ loop_inner + mov r8, r8, lsl #16 + orr r12,r8, r8, lsr #16 + mov r8 ,r12 +4: @ draw_32x: + subs lr, lr, #4 @ store 4 pixels + stmgeia r0!, {r8, r12} + bgt 4b @ draw_32x + beq 2b @ loop_inner + adds lr, lr, #2 @ store 1-3 leftover pixels + strge r8, [r0], #4 + strneh r8, [r0], #2 + b 2b @ loop_inner + +5: @ draw_md: + subs lr, lr, #1 + ldrgeb r7, [r11], #1 @ MD pixel + blt 2b @ loop_inner + cmp r3, r7, lsl #26 @ MD has bg pixel? +.if \do_md + mov r7, r7, lsl #1 + ldrneh r7 ,[r9, r7] + strneh r7 ,[r0], #2 @ *dst++ = palmd[*pmd] +.else + addne r0, r0, #2 +.endif + bne 5b @ draw_md + + and r7 ,r8, #0x03e0 + mov r8, r8, lsl #11 + orr r8, r8, r8, lsr #(10+11) + orr r8, r8, r7 ,lsl #1 + bic r8, r8, #0x0020 @ kill prio bit + strh r8, [r0], #2 @ *dst++ = bgr2rgb(*p32x++) + +6: @ draw_md_32x: + subs lr, lr, #1 + ldrgeb r7, [r11], #1 @ MD pixel + blt 2b @ loop_inner + cmp r3, r7, lsl #26 @ MD has bg pixel? +.if \do_md + mov r7, r7, lsl #1 + ldrneh r7 ,[r9, r7] @ *dst++ = palmd[*pmd] + moveq r7 ,r8 @ *dst++ = bgr2rgb(*p32x++) + strh r7 ,[r0], #2 +.else + streqh r8, [r0] @ *dst++ = bgr2rgb(*p32x++) + add r0, r0, #2 +.endif + b 6b @ draw_md_32x +.endm + + +@ packed pixel +@ note: this may read a few bytes over the end of PicoDraw2FB and dram, +@ so those should have a bit more alloc'ed than really needed. +@ unsigned short *dst, unsigned short *dram, int lines_sft_offs, int mdbg +.macro make_do_loop_pp name call_scan do_md +.global \name +\name: + stmfd sp!, {r4-r11,lr} + + PIC_LDR(lr, r9, Pico) + PIC_LDR(r10,r9, Pico32xMem) + ldr r9,=OFS_PMEM32x_pal_native
View file
libretro-picodrive-0~git20200112.tar.xz/pico/32x/memory.c -> libretro-picodrive-0~git20200716.tar.xz/pico/32x/memory.c
Changed
@@ -1,6 +1,7 @@ /* * PicoDrive * (C) notaz, 2009,2010,2013 + * (C) kub, 2019 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -40,7 +41,9 @@ */ #include "../pico_int.h" #include "../memory.h" + #include "../../cpu/sh2/compiler.h" +DRC_DECLARE_SR; static const char str_mars[] = "MARS"; @@ -56,32 +59,40 @@ #define REG8IN16(ptr, offs) ((u8 *)ptr)[(offs) ^ 1] // poll detection -#define POLL_THRESHOLD 3 +#define POLL_THRESHOLD 5 static struct { - u32 addr, cycles; + u32 addr1, addr2, cycles; int cnt; } m68k_poll; static int m68k_poll_detect(u32 a, u32 cycles, u32 flags) { int ret = 0; + // support polling on 2 addresses - seen in Wolfenstein + int match = (a - m68k_poll.addr1 <= 2 || a - m68k_poll.addr2 <= 2); - if (a - 2 <= m68k_poll.addr && m68k_poll.addr <= a + 2 - && cycles - m68k_poll.cycles <= 64 && !SekNotPolling) + if (match && cycles - m68k_poll.cycles <= 64 && !SekNotPolling) { - if (m68k_poll.cnt++ > POLL_THRESHOLD) { + // detect split 32bit access by same cycle count, and ignore those + if (cycles != m68k_poll.cycles && ++m68k_poll.cnt >= POLL_THRESHOLD) { if (!(Pico32x.emu_flags & flags)) { elprintf(EL_32X, "m68k poll addr %08x, cyc %u", a, cycles - m68k_poll.cycles); - ret = 1; } Pico32x.emu_flags |= flags; + ret = 1; } } else { + // reset poll state in case of restart by interrupt + Pico32x.emu_flags &= ~(P32XF_68KCPOLL|P32XF_68KVPOLL); + SekSetStop(0); m68k_poll.cnt = 0; - m68k_poll.addr = a; + if (!match) { + m68k_poll.addr2 = m68k_poll.addr1; + m68k_poll.addr1 = a; + } SekNotPolling = 0; } m68k_poll.cycles = cycles; @@ -97,15 +108,19 @@ Pico32x.emu_flags &= ~flags; SekSetStop(0); } - m68k_poll.addr = m68k_poll.cnt = 0; + m68k_poll.addr1 = m68k_poll.addr2 = m68k_poll.cnt = 0; } -static void sh2_poll_detect(SH2 *sh2, u32 a, u32 flags, int maxcnt) +void NOINLINE p32x_sh2_poll_detect(u32 a, SH2 *sh2, u32 flags, int maxcnt) { - int cycles_left = sh2_cycles_left(sh2); + u32 cycles_done = sh2_cycles_done_t(sh2); + u32 cycles_diff = cycles_done - sh2->poll_cycles; - if (a == sh2->poll_addr && sh2->poll_cycles - cycles_left <= 10) { - if (sh2->poll_cnt++ > maxcnt) { + // reading 2 consecutive 16bit values is probably a 32bit access. detect this + // by checking address (max 2 bytes away) and cycles (max 2 cycles later). + // no polling if more than 20 cycles have passed since last detect call. + if (a - sh2->poll_addr <= 2 && CYCLES_GE(20, cycles_diff)) { + if (CYCLES_GT(cycles_diff, 2) && ++sh2->poll_cnt >= maxcnt) { if (!(sh2->state & flags)) elprintf_sh2(sh2, EL_32X, "state: %02x->%02x", sh2->state, sh2->state | flags); @@ -113,40 +128,179 @@ sh2->state |= flags; sh2_end_run(sh2, 1); pevt_log_sh2(sh2, EVT_POLL_START); - return; +#ifdef DRC_SH2 + // mark this as an address used for polling if SDRAM + if ((a & 0xc6000000) == 0x06000000) { + unsigned char *p = sh2->p_drcblk_ram; + p[(a & 0x3ffff) >> SH2_DRCBLK_RAM_SHIFT] |= 0x80; + // mark next word too to enable poll fifo for 32bit access + p[((a+2) & 0x3ffff) >> SH2_DRCBLK_RAM_SHIFT] |= 0x80; + } +#endif } } - else + else if (!(sh2->state & (SH2_STATE_CPOLL|SH2_STATE_VPOLL|SH2_STATE_RPOLL))) { sh2->poll_cnt = 0; - sh2->poll_addr = a; - sh2->poll_cycles = cycles_left; + sh2->poll_addr = a; + } + sh2->poll_cycles = cycles_done; } -void p32x_sh2_poll_event(SH2 *sh2, u32 flags, u32 m68k_cycles) +void NOINLINE p32x_sh2_poll_event(SH2 *sh2, u32 flags, u32 m68k_cycles) { if (sh2->state & flags) { elprintf_sh2(sh2, EL_32X, "state: %02x->%02x", sh2->state, sh2->state & ~flags); - if (sh2->m68krcycles_done < m68k_cycles) + if (sh2->m68krcycles_done < m68k_cycles && !(sh2->state & SH2_STATE_RUN)) sh2->m68krcycles_done = m68k_cycles; pevt_log_sh2_o(sh2, EVT_POLL_END); + sh2->state &= ~flags; } - sh2->state &= ~flags; - sh2->poll_addr = sh2->poll_cycles = sh2->poll_cnt = 0; + if (!(sh2->state & (SH2_STATE_CPOLL|SH2_STATE_VPOLL|SH2_STATE_RPOLL))) + sh2->poll_addr = sh2->poll_cycles = sh2->poll_cnt = 0; } -static void sh2s_sync_on_read(SH2 *sh2) +static void sh2s_sync_on_read(SH2 *sh2, unsigned cycles) { - int cycles; if (sh2->poll_cnt != 0) return; - cycles = sh2_cycles_done(sh2); - if (cycles > 600) - p32x_sync_other_sh2(sh2, sh2->m68krcycles_done + cycles / 3); + if (p32x_sh2_ready(sh2->other_sh2, cycles-250)) + p32x_sync_other_sh2(sh2, cycles); +} + +// poll fifo, stores writes to potential addresses used for polling. +// This is used to correctly deliver syncronisation data to the 3 cpus. The +// fifo stores 16 bit values, 8/32 bit accesses must be adapted accordingly. +#define PFIFO_SZ 4 +#define PFIFO_CNT 8 +struct sh2_poll_fifo { + u32 cycles; + u32 a; + u16 d; + int cpu; +} sh2_poll_fifo[PFIFO_CNT][PFIFO_SZ]; +unsigned sh2_poll_rd[PFIFO_CNT], sh2_poll_wr[PFIFO_CNT]; // ringbuffer pointers + +static NOINLINE u32 sh2_poll_read(u32 a, u32 d, unsigned int cycles, SH2* sh2) +{ + int hix = (a >> 1) % PFIFO_CNT; + struct sh2_poll_fifo *fifo = sh2_poll_fifo[hix]; + struct sh2_poll_fifo *p; + int cpu = sh2 ? sh2->is_slave : -1; + unsigned idx; + + a &= ~0x20000000; // ignore writethrough bit + // fetch oldest write to address from fifo, but stop when reaching the present + idx = sh2_poll_rd[hix]; + while (idx != sh2_poll_wr[hix] && CYCLES_GE(cycles, fifo[idx].cycles)) { + p = &fifo[idx]; + idx = (idx+1) % PFIFO_SZ; + + if (cpu != p->cpu) { + if (CYCLES_GT(cycles, p->cycles+80)) { + // drop older fifo stores that may cause synchronisation problems. + p->a = -1; + } else if (p->a == a) { + // replace current data with fifo value and discard fifo entry + d = p->d; + p->a = -1; + break; + } + } + } + return d; +} + +static NOINLINE void sh2_poll_write(u32 a, u32 d, unsigned int cycles, SH2 *sh2) +{ + int hix = (a >> 1) % PFIFO_CNT; + struct sh2_poll_fifo *fifo = sh2_poll_fifo[hix]; + struct sh2_poll_fifo *q;
View file
libretro-picodrive-0~git20200716.tar.xz/pico/32x/memory_arm.S
Added
@@ -0,0 +1,288 @@ +/* + * PicoDrive 32X memory access functions, assembler version + * (C) KUB, 2018 + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ + +#include "../pico_int_offs.h" + +@ 32X bank sizes... TODO this should somehow come from an include file +.equ SH2_ROM_SHIFT, 10 @ 0x003fffff +.equ SH2_RAM_SHIFT, 14 @ 0x0003ffff +.equ SH2_DRAM_SHIFT,15 @ 0x0001ffff +.equ SH2_DA_SHIFT, 20 @ 0x00000fff + +.equ SH2_DRAM_OW, 1<<(32-SH2_DRAM_SHIFT) @ DRAM overwrite mode bit + +.text +.align 5 + +#if 0 +@ u32 a, SH2 *sh2 +.global sh2_read8_rom +.global sh2_read8_sdram +.global sh2_read8_da +.global sh2_read8_dram +.global sh2_read16_rom +.global sh2_read16_sdram +.global sh2_read16_da +.global sh2_read16_dram +.global sh2_read32_rom +.global sh2_read32_sdram +.global sh2_read32_da +.global sh2_read32_dram +#endif + +@ u32 a, u32 d, SH2 *sh2 +.global sh2_write8_sdram +.global sh2_write8_da +.global sh2_write8_dram +.global sh2_write16_sdram +.global sh2_write16_da +.global sh2_write16_dram +.global sh2_write32_sdram +.global sh2_write32_da +.global sh2_write32_dram + +#if 0 +sh2_read8_rom: + ldr ip, [r1, #OFS_SH2_p_rom] + eor r0, r0, #1 + mov r0, r0, lsl #SH2_ROM_SHIFT + ldrb r0, [ip, r0, lsr #SH2_ROM_SHIFT] + bx lr + +sh2_read8_sdram: + ldr ip, [r1, #OFS_SH2_p_sdram] + eor r0, r0, #1 + mov r0, r0, lsl #SH2_RAM_SHIFT + ldrb r0, [ip, r0, lsr #SH2_RAM_SHIFT] + bx lr + +sh2_read8_da: + ldr ip, [r1, #OFS_SH2_p_da] + eor r0, r0, #1 + mov r0, r0, lsl #SH2_DA_SHIFT + ldrb r0, [ip, r0, lsr #SH2_DA_SHIFT] + bx lr + +sh2_read8_dram: + ldr ip, [r1, #OFS_SH2_p_dram] + eor r0, r0, #1 + mov r0, r0, lsl #SH2_DRAM_SHIFT + ldrb r0, [ip, r0, lsr #SH2_DRAM_SHIFT] + bx lr + +sh2_read16_rom: + ldr ip, [r1, #OFS_SH2_p_rom] + mov r0, r0, lsl #SH2_ROM_SHIFT + mov r0, r0, lsr #SH2_ROM_SHIFT + ldrh r0, [ip, r0] + bx lr + +sh2_read16_sdram: + ldr ip, [r1, #OFS_SH2_p_sdram] + mov r0, r0, lsl #SH2_RAM_SHIFT + mov r0, r0, lsr #SH2_RAM_SHIFT + ldrh r0, [ip, r0] + bx lr + +sh2_read16_da: + ldr ip, [r1, #OFS_SH2_p_da] + mov r0, r0, lsl #SH2_DA_SHIFT + mov r0, r0, lsr #SH2_DA_SHIFT + ldrh r0, [ip, r0] + bx lr + +sh2_read16_dram: + ldr ip, [r1, #OFS_SH2_p_dram] + mov r0, r0, lsl #SH2_DRAM_SHIFT + mov r0, r0, lsr #SH2_DRAM_SHIFT + ldrh r0, [ip, r0] + bx lr + +sh2_read32_rom: + ldr ip, [r1, #OFS_SH2_p_rom] + mov r0, r0, lsl #SH2_ROM_SHIFT + ldr r0, [ip, r0, lsr #SH2_ROM_SHIFT] + mov r0, r0, ror #16 + bx lr + +sh2_read32_sdram: + ldr ip, [r1, #OFS_SH2_p_sdram] + mov r0, r0, lsl #SH2_RAM_SHIFT + ldr r0, [ip, r0, lsr #SH2_RAM_SHIFT] + mov r0, r0, ror #16 + bx lr + +sh2_read32_da: + ldr ip, [r1, #OFS_SH2_p_da] + mov r0, r0, lsl #SH2_DA_SHIFT + ldr r0, [ip, r0, lsr #SH2_DA_SHIFT] + mov r0, r0, ror #16 + bx lr + +sh2_read32_dram: + ldr ip, [r1, #OFS_SH2_p_dram] + mov r0, r0, lsl #SH2_DRAM_SHIFT + ldr r0, [ip, r0, lsr #SH2_DRAM_SHIFT] + mov r0, r0, ror #16 + bx lr +#endif + +sh2_write8_sdram: + @ preserve r0-r2 for tail call + ldr ip, [r2, #OFS_SH2_p_sdram] + eor r3, r0, #1 + mov r3, r3, lsl #SH2_RAM_SHIFT + strb r1, [ip, r3, lsr #SH2_RAM_SHIFT] +#ifdef DRC_SH2 + ldr r1, [r2, #OFS_SH2_p_drcblk_ram] + ldrb r3, [r1, r3, lsr #SH2_RAM_SHIFT+1] + cmp r3, #0 + bxeq lr + @ need to load aligned 16 bit data for check + bic r0, r0, #1 + mov r1, r0, lsl #SH2_RAM_SHIFT + mov r1, r1, lsr #SH2_RAM_SHIFT + ldrh r1, [ip, r1] + b sh2_sdram_checks +#else + bx lr +#endif + +sh2_write8_da: + @ preserve r0 and r2 for tail call + ldr ip, [r2, #OFS_SH2_p_da] + eor r3, r0, #1 + mov r3, r3, lsl #SH2_DA_SHIFT + strb r1, [ip, r3, lsr #SH2_DA_SHIFT] +#ifdef DRC_SH2 + ldr ip, [r2, #OFS_SH2_p_drcblk_da] + ldrb r1, [ip, r3, lsr #SH2_DA_SHIFT+1] + bic r0, r0, #1 + cmp r1, #0 + bxeq lr + mov r1, #2 + b sh2_drc_wcheck_da +#else + bx lr +#endif + +sh2_write8_dram: + tst r1, #0xff + ldrne ip, [r2, #OFS_SH2_p_dram] + eorne r3, r0, #1 + movne r3, r3, lsl #SH2_DRAM_SHIFT + strneb r1, [ip, r3, lsr #SH2_DRAM_SHIFT] + bx lr + +sh2_write16_sdram: + @ preserve r0-r2 for tail call + ldr ip, [r2, #OFS_SH2_p_sdram] + mov r3, r0, lsl #SH2_RAM_SHIFT + mov r3, r3, lsr #SH2_RAM_SHIFT + strh r1, [ip, r3] +#ifdef DRC_SH2 + ldr ip, [r2, #OFS_SH2_p_drcblk_ram] + ldrb r3, [ip, r3, lsr #1] + cmp r3, #0 + bxeq lr + b sh2_sdram_checks +#else + bx lr +#endif + +sh2_write16_da: + @ preserve r0 and r2 for tail call
View file
libretro-picodrive-0~git20200112.tar.xz/pico/32x/pwm.c -> libretro-picodrive-0~git20200716.tar.xz/pico/32x/pwm.c
Changed
@@ -7,32 +7,42 @@ */ #include "../pico_int.h" -static int pwm_cycles; -static int pwm_mult; -static int pwm_ptr; -static int pwm_irq_reload; -static int pwm_doing_fifo; -static int pwm_silent; +static struct { + int cycles; + unsigned mult; + int ptr; + int irq_reload; + int doing_fifo; + int silent; + int irq_timer; + int irq_state; + short current[2]; +} pwm; + +enum { PWM_IRQ_LOCKED, PWM_IRQ_STOPPED, PWM_IRQ_LOW, PWM_IRQ_HIGH }; void p32x_pwm_ctl_changed(void) { int control = Pico32x.regs[0x30 / 2]; int cycles = Pico32x.regs[0x32 / 2]; + int pwm_irq_opt = PicoIn.opt & POPT_PWM_IRQ_OPT; cycles = (cycles - 1) & 0x0fff; - pwm_cycles = cycles; + pwm.cycles = cycles; // supposedly we should stop FIFO when xMd is 0, // but mars test disagrees - pwm_mult = 0; + pwm.mult = 0; if ((control & 0x0f) != 0) - pwm_mult = 0x10000 / cycles; + pwm.mult = 0x10000 / cycles; - pwm_irq_reload = (control & 0x0f00) >> 8; - pwm_irq_reload = ((pwm_irq_reload - 1) & 0x0f) + 1; + pwm.irq_timer = (control & 0x0f00) >> 8; + pwm.irq_timer = ((pwm.irq_timer - 1) & 0x0f) + 1; + pwm.irq_reload = pwm.irq_timer; + pwm.irq_state = pwm_irq_opt ? PWM_IRQ_STOPPED: PWM_IRQ_LOCKED; if (Pico32x.pwm_irq_cnt == 0) - Pico32x.pwm_irq_cnt = pwm_irq_reload; + Pico32x.pwm_irq_cnt = pwm.irq_reload; } static void do_pwm_irq(SH2 *sh2, unsigned int m68k_cycles) @@ -40,7 +50,7 @@ p32x_trigger_irq(sh2, m68k_cycles, P32XI_PWM); if (Pico32x.regs[0x30 / 2] & P32XP_RTP) { - p32x_event_schedule(m68k_cycles, P32X_EVENT_PWM, pwm_cycles / 3 + 1); + p32x_event_schedule(m68k_cycles, P32X_EVENT_PWM, pwm.cycles / 3 + 1); // note: might recurse p32x_dreq1_trigger(); } @@ -48,16 +58,16 @@ static int convert_sample(unsigned int v) { + if (v > pwm.cycles) + v = pwm.cycles; if (v == 0) return 0; - if (v > pwm_cycles) - v = pwm_cycles; - return ((int)v - pwm_cycles / 2) * pwm_mult; + return v * pwm.mult - 0x10000/2; } #define consume_fifo(sh2, m68k_cycles) { \ int cycles_diff = ((m68k_cycles) * 3) - Pico32x.pwm_cycle_p; \ - if (cycles_diff >= pwm_cycles) \ + if (cycles_diff >= pwm.cycles) \ consume_fifo_do(sh2, m68k_cycles, cycles_diff); \ } @@ -69,67 +79,70 @@ unsigned short *fifo_r = mem->pwm_fifo[1]; int sum = 0; - if (pwm_cycles == 0 || pwm_doing_fifo) + if (pwm.cycles == 0 || pwm.doing_fifo) return; elprintf(EL_PWM, "pwm: %u: consume %d/%d, %d,%d ptr %d", - m68k_cycles, sh2_cycles_diff, sh2_cycles_diff / pwm_cycles, - Pico32x.pwm_p[0], Pico32x.pwm_p[1], pwm_ptr); + m68k_cycles, sh2_cycles_diff, sh2_cycles_diff / pwm.cycles, + Pico32x.pwm_p[0], Pico32x.pwm_p[1], pwm.ptr); // this is for recursion from dreq1 writes - pwm_doing_fifo = 1; + pwm.doing_fifo = 1; - for (; sh2_cycles_diff >= pwm_cycles; sh2_cycles_diff -= pwm_cycles) + while (sh2_cycles_diff >= pwm.cycles) { + sh2_cycles_diff -= pwm.cycles; + if (Pico32x.pwm_p[0] > 0) { - fifo_l[0] = fifo_l[1]; - fifo_l[1] = fifo_l[2]; - fifo_l[2] = fifo_l[3]; + mem->pwm_index[0] = (mem->pwm_index[0]+1) % 4; Pico32x.pwm_p[0]--; - mem->pwm_current[0] = convert_sample(fifo_l[0]); - sum += mem->pwm_current[0]; + pwm.current[0] = convert_sample(fifo_l[mem->pwm_index[0]]); + sum |= (u16)pwm.current[0]; } if (Pico32x.pwm_p[1] > 0) { - fifo_r[0] = fifo_r[1]; - fifo_r[1] = fifo_r[2]; - fifo_r[2] = fifo_r[3]; + mem->pwm_index[1] = (mem->pwm_index[1]+1) % 4; Pico32x.pwm_p[1]--; - mem->pwm_current[1] = convert_sample(fifo_r[0]); - sum += mem->pwm_current[1]; + pwm.current[1] = convert_sample(fifo_r[mem->pwm_index[1]]); + sum |= (u16)pwm.current[1]; } - mem->pwm[pwm_ptr * 2 ] = mem->pwm_current[0]; - mem->pwm[pwm_ptr * 2 + 1] = mem->pwm_current[1]; - pwm_ptr = (pwm_ptr + 1) & (PWM_BUFF_LEN - 1); + mem->pwm[pwm.ptr * 2 ] = pwm.current[0]; + mem->pwm[pwm.ptr * 2 + 1] = pwm.current[1]; + pwm.ptr = (pwm.ptr + 1) & (PWM_BUFF_LEN - 1); if (--Pico32x.pwm_irq_cnt == 0) { - Pico32x.pwm_irq_cnt = pwm_irq_reload; + Pico32x.pwm_irq_cnt = pwm.irq_reload; do_pwm_irq(sh2, m68k_cycles); + } else if (Pico32x.pwm_p[1] == 0 && pwm.irq_state >= PWM_IRQ_LOW) { + // buffer underrun. Reduce reload rate if above programmed setting. + if (pwm.irq_reload > pwm.irq_timer) + pwm.irq_reload--; + pwm.irq_state = PWM_IRQ_LOW; } } Pico32x.pwm_cycle_p = m68k_cycles * 3 - sh2_cycles_diff; - pwm_doing_fifo = 0; + pwm.doing_fifo = 0; if (sum != 0) - pwm_silent = 0; + pwm.silent = 0; } static int p32x_pwm_schedule_(SH2 *sh2, unsigned int m68k_now) { - unsigned int sh2_now = m68k_now * 3; + unsigned int pwm_now = m68k_now * 3; int cycles_diff_sh2; - if (pwm_cycles == 0) + if (pwm.cycles == 0) return 0; - cycles_diff_sh2 = sh2_now - Pico32x.pwm_cycle_p; - if (cycles_diff_sh2 >= pwm_cycles) + cycles_diff_sh2 = pwm_now - Pico32x.pwm_cycle_p; + if (cycles_diff_sh2 >= pwm.cycles) consume_fifo_do(sh2, m68k_now, cycles_diff_sh2); if (!((Pico32x.sh2irq_mask[0] | Pico32x.sh2irq_mask[1]) & 1)) return 0; // masked by everyone - cycles_diff_sh2 = sh2_now - Pico32x.pwm_cycle_p; - return (Pico32x.pwm_irq_cnt * pwm_cycles + cycles_diff_sh2 = pwm_now - Pico32x.pwm_cycle_p; + return (Pico32x.pwm_irq_cnt * pwm.cycles - cycles_diff_sh2) / 3 + 1; } @@ -166,21 +179,21 @@ consume_fifo(sh2, m68k_cycles); a &= 0x0e; - switch (a) { - case 0: // control - case 2: // cycle + switch (a/2) { + case 0/2: // control + case 2/2: // cycle d = Pico32x.regs[(0x30 + a) / 2]; break; - case 4: // L ch + case 4/2: // L ch if (Pico32x.pwm_p[0] == 3) d |= P32XP_FULL;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/32x/sh2soc.c -> libretro-picodrive-0~git20200716.tar.xz/pico/32x/sh2soc.c
Changed
@@ -25,6 +25,9 @@ #include "../pico_int.h" #include "../memory.h" +#include "../../cpu/sh2/compiler.h" +DRC_DECLARE_SR; + // DMAC handling struct dma_chan { unsigned int sar, dar; // src, dst addr @@ -87,6 +90,7 @@ case 0: d = p32x_sh2_read8(chan->sar, sh2); p32x_sh2_write8(chan->dar, d, sh2); + break; case 1: d = p32x_sh2_read16(chan->sar, sh2); p32x_sh2_write16(chan->dar, d, sh2); @@ -125,6 +129,33 @@ chan->sar += size; } +// optimization for copying around memory with SH2 DMA +static void dmac_memcpy(struct dma_chan *chan, SH2 *sh2) +{ + u32 size = (chan->chcr >> 10) & 3, up = chan->chcr & (1 << 14); + int count; + + if (!up || chan->tcr < 4) + return; +#if MARS_CHECK_HACK + // XXX Mars Check Program copies 32K longwords (128KB) from a 64KB buffer in + // ROM or DRAM to SDRAM in 4-longword mode, overwriting an SDRAM comm area in + // turn, which crashes the test on emulators without CPU cache emulation. + // This may be a bug in Mars Check. As a kludge limit the transfer to 64KB, + // which is what the check program test uses for checking the result. + // A better way would clearly be to have a mechanism to patch the ROM... + if (size == 3 && chan->tcr == 32768 && chan->dar == 0x06020000) size = 1; +#endif + if (size == 3) size = 2; // 4-word xfer mode still counts in words + // XXX check TCR being a multiple of 4 in 4-word xfer mode? + // XXX check alignment of sar/dar, generating a bus error if unaligned? + count = p32x_sh2_memcpy(chan->dar, chan->sar, chan->tcr, 1 << size, sh2); + + chan->sar += count << size; + chan->dar += count << size; + chan->tcr -= count; +} + // DMA trigger by SH2 register write static void dmac_trigger(SH2 *sh2, struct dma_chan *chan) { @@ -134,6 +165,12 @@ if (chan->chcr & DMA_AR) { // auto-request transfer + sh2->state |= SH2_STATE_SLEEP; + if ((((chan->chcr >> 12) ^ (chan->chcr >> 14)) & 3) == 0 && + (((chan->chcr >> 14) ^ (chan->chcr >> 15)) & 1) == 1) { + // SM == DM and either DM0 or DM1 are set. check for mem to mem copy + dmac_memcpy(chan, sh2); + } while ((int)chan->tcr > 0) dmac_transfer_one(sh2, chan); dmac_transfer_complete(sh2, chan); @@ -160,8 +197,9 @@ } // timer state - FIXME -static int timer_cycles[2]; -static int timer_tick_cycles[2]; +static u32 timer_cycles[2]; +static u32 timer_tick_cycles[2]; +static u32 timer_tick_factor[2]; // timers void p32x_timers_recalc(void) @@ -171,6 +209,9 @@ // SH2 timer step for (i = 0; i < 2; i++) { + sh2s[i].state &= ~SH2_TIMER_RUN; + if (PREG8(sh2s[i].peri_regs, 0x80) & 0x20) // TME + sh2s[i].state |= SH2_TIMER_RUN; tmp = PREG8(sh2s[i].peri_regs, 0x80) & 7; // Sclk cycles per timer tick if (tmp) @@ -178,36 +219,35 @@ else cycles = 2; timer_tick_cycles[i] = cycles; + timer_tick_factor[i] = (1ULL << 32) / cycles; timer_cycles[i] = 0; elprintf(EL_32XP, "WDT cycles[%d] = %d", i, cycles); } } -void p32x_timers_do(unsigned int m68k_slice) +NOINLINE void p32x_timer_do(SH2 *sh2, unsigned int m68k_slice) { unsigned int cycles = m68k_slice * 3; - int cnt, i; - - // WDT timers - for (i = 0; i < 2; i++) { - void *pregs = sh2s[i].peri_regs; - if (PREG8(pregs, 0x80) & 0x20) { // TME - timer_cycles[i] += cycles; - cnt = PREG8(pregs, 0x81); - while (timer_cycles[i] >= timer_tick_cycles[i]) { - timer_cycles[i] -= timer_tick_cycles[i]; - cnt++; - } - if (cnt >= 0x100) { - int level = PREG8(pregs, 0xe3) >> 4; - int vector = PREG8(pregs, 0xe4) & 0x7f; - elprintf(EL_32XP, "%csh2 WDT irq (%d, %d)", - i ? 's' : 'm', level, vector); - sh2_internal_irq(&sh2s[i], level, vector); - cnt &= 0xff; - } - PREG8(pregs, 0x81) = cnt; + void *pregs = sh2->peri_regs; + int cnt; int i = sh2->is_slave; + + // WDT timer + timer_cycles[i] += cycles; + if (timer_cycles[i] > timer_tick_cycles[i]) { + // cnt = timer_cycles[i] / timer_tick_cycles[i]; + cnt = (1ULL * timer_cycles[i] * timer_tick_factor[i]) >> 32; + timer_cycles[i] -= timer_tick_cycles[i] * cnt; + + cnt += PREG8(pregs, 0x81); + if (cnt >= 0x100) { + int level = PREG8(pregs, 0xe3) >> 4; + int vector = PREG8(pregs, 0xe4) & 0x7f; + elprintf(EL_32XP, "%csh2 WDT irq (%d, %d)", + i ? 's' : 'm', level, vector); + sh2_internal_irq(sh2, level, vector); + cnt &= 0xff; } + PREG8(pregs, 0x81) = cnt; } } @@ -225,7 +265,7 @@ // SH2 internal peripheral memhandlers // we keep them in little endian format -u32 sh2_peripheral_read8(u32 a, SH2 *sh2) +u32 REGPARM(2) sh2_peripheral_read8(u32 a, SH2 *sh2) { u8 *r = (void *)sh2->peri_regs; u32 d; @@ -235,30 +275,52 @@ elprintf_sh2(sh2, EL_32XP, "peri r8 [%08x] %02x @%06x", a | ~0x1ff, d, sh2_pc(sh2)); + if ((a & 0x1c0) == 0x140) { + // abused as comm area + DRC_SAVE_SR(sh2); + p32x_sh2_poll_detect(a, sh2, SH2_STATE_CPOLL, 3); + DRC_RESTORE_SR(sh2); + } return d; } -u32 sh2_peripheral_read16(u32 a, SH2 *sh2) +u32 REGPARM(2) sh2_peripheral_read16(u32 a, SH2 *sh2) { u16 *r = (void *)sh2->peri_regs; u32 d; - a &= 0x1ff; + a &= 0x1fe; d = r[(a / 2) ^ 1]; elprintf_sh2(sh2, EL_32XP, "peri r16 [%08x] %04x @%06x", a | ~0x1ff, d, sh2_pc(sh2)); + if ((a & 0x1c0) == 0x140) { + // abused as comm area + DRC_SAVE_SR(sh2); + p32x_sh2_poll_detect(a, sh2, SH2_STATE_CPOLL, 3); + DRC_RESTORE_SR(sh2); + } return d; } -u32 sh2_peripheral_read32(u32 a, SH2 *sh2) +u32 REGPARM(2) sh2_peripheral_read32(u32 a, SH2 *sh2) { u32 d; + a &= 0x1fc; d = sh2->peri_regs[a / 4]; elprintf_sh2(sh2, EL_32XP, "peri r32 [%08x] %08x @%06x", a | ~0x1ff, d, sh2_pc(sh2)); + if (a == 0x18c) + // kludge for polling COMM while polling for end of DMA
View file
libretro-picodrive-0~git20200112.tar.xz/pico/arm_features.h -> libretro-picodrive-0~git20200716.tar.xz/pico/arm_features.h
Changed
@@ -49,4 +49,32 @@ #endif +// indexed branch (XB) via branch table (BT) +#ifdef __PIC__ +#define PIC_XB(c,r,s) add##c pc, r, s +#define PIC_BT(a) b a +#else +#define PIC_XB(c,r,s) ldr##c pc, [pc, r, s] +#define PIC_BT(a) .word a +#endif + +// load data address (LDR) either via literal pool or via GOT +#ifdef __PIC__ +// can't use pool loads since ldr= only allows a symbol or a constant expr :-( +#define PIC_LDR_INIT() \ + .macro pic_ldr r t a; \ + ldr \r, [pc, $.LD\@-.-8]; \ + ldr \t, [pc, $.LD\@-.-4]; \ + .LP\@:add \r, pc; \ + ldr \r, [\r, \t]; \ + add pc, $4; \ + .LD\@:.word _GLOBAL_OFFSET_TABLE_-.LP\@-8; \ + .word \a(GOT); \ + .endm; +#define PIC_LDR(r,t,a) pic_ldr r, t, a +#else +#define PIC_LDR_INIT() +#define PIC_LDR(r,t,a) ldr r, =a +#endif + #endif /* __ARM_FEATURES_H__ */
View file
libretro-picodrive-0~git20200112.tar.xz/pico/carthw/carthw.h -> libretro-picodrive-0~git20200716.tar.xz/pico/carthw/carthw.h
Changed
@@ -1,5 +1,6 @@ /* svp */ +#include "../pico_types.h" #include "svp/ssp16.h" typedef struct { @@ -18,7 +19,7 @@ extern int carthw_ssf2_active; extern unsigned char carthw_ssf2_banks[8]; void carthw_ssf2_startup(void); -void carthw_ssf2_write8(unsigned int a, unsigned int d); +void carthw_ssf2_write8(u32 a, u32 d); /* misc */ void carthw_Xin1_startup(void);
View file
libretro-picodrive-0~git20200112.tar.xz/pico/carthw/svp/compiler.c -> libretro-picodrive-0~git20200716.tar.xz/pico/carthw/svp/compiler.c
Changed
@@ -693,9 +693,9 @@ /* spacial version of call for calling C needed on ios, since we use r9.. */ static void emith_call_c_func(void *target) { - EOP_STMFD_SP(A_R7M|A_R9M); + EOP_STMFD_SP(M2(7,9)); emith_call(target); - EOP_LDMFD_SP(A_R7M|A_R9M); + EOP_LDMFD_SP(M2(7,9)); } #else #define emith_call_c_func emith_call @@ -1438,12 +1438,9 @@ } tr_mov16(0, *pc); tr_r0_to_STACK(*pc); - if (tmpv != A_COND_AL) { - u32 *real_ptr = tcache_ptr; - tcache_ptr = jump_op; - EOP_C_B(tr_neg_cond(tmpv),0,real_ptr - jump_op - 2); - tcache_ptr = real_ptr; - } + if (tmpv != A_COND_AL) + EOP_C_B_PTR(jump_op, tr_neg_cond(tmpv), 0, + tcache_ptr - jump_op - 2); tr_mov16_cond(tmpv, 0, imm); if (tmpv != A_COND_AL) tr_mov16_cond(tr_neg_cond(tmpv), 0, *pc); @@ -1712,12 +1709,8 @@ ssp_block_table[pc]; if (target != NULL) emith_jump(target); - else { - int ops = emith_jump(ssp_drc_next); - end_ptr = tcache_ptr; - // cause the next block to be emitted over jump instruction - tcache_ptr -= ops; - } + else + emith_jump(ssp_drc_next); } else { u32 *target1 = (pc < 0x400) ? @@ -1795,6 +1788,8 @@ tr_flush_dirty_ST(); tr_flush_dirty_pmcrs(); block_end = emit_block_epilogue(ccount, end_cond, jump_pc, pc); + emith_pool_commit(0); + emith_flush(); if (tcache_ptr - (u32 *)tcache > DRC_TCACHE_SIZE/4) { elprintf(EL_ANOMALY|EL_STATUS|EL_SVP, "tcache overflow!\n");
View file
libretro-picodrive-0~git20200112.tar.xz/pico/carthw/svp/stub_arm.S -> libretro-picodrive-0~git20200716.tar.xz/pico/carthw/svp/stub_arm.S
Changed
@@ -8,7 +8,7 @@ #include "../../arm_features.h" -.syntax unified +@.syntax unified .text .align 2 @@ -281,8 +281,8 @@ bgt ssp_hle_902_loop tst r12, #1 - ldrhne r0, [r2], #2 - strhne r0, [r3], #2 + ldrneh r0, [r2], #2 + strneh r0, [r3], #2 ldr r0, [r7, #SSP_OFFS_IRAM_ROM] add r1, r7, #0x200 @@ -501,7 +501,7 @@ mov r12, #0x4000 orr r12,r12,#0x0018 subs r12,r3, r12 - subsne r12,r12,#0x0400 + subnes r12,r12,#0x0400 blne tr_unhandled orr r2, r2, r2, lsl #16 @@ -510,7 +510,7 @@ hle_07_036_no_ovrwr: tst r1, #2 - strhne r2, [r1], #0x3e @ align + strneh r2, [r1], #0x3e @ align subne r0, r0, #1 subs r0, r0, #4 blt hle_07_036_l2 @@ -525,7 +525,7 @@ tst r0, #2 strne r2, [r1], #0x40 tst r0, #1 - strhne r2, [r1], #2 + strneh r2, [r1], #2 b hle_07_036_end_copy hle_07_036_ovrwr: @@ -562,10 +562,10 @@ hle_07_036_ol2: tst r0, #1 - ldrhne r3, [r1] + ldrneh r3, [r1] andne r3, r3, r12 orrne r3, r3, r2 - strhne r3, [r1], #2 + strneh r3, [r1], #2 hle_07_036_end_copy: ldr r2, [r7, #SSP_OFFS_DRAM]
View file
libretro-picodrive-0~git20200112.tar.xz/pico/cd/gfx_dma.c -> libretro-picodrive-0~git20200716.tar.xz/pico/cd/gfx_dma.c
Changed
@@ -10,10 +10,6 @@ #include "cell_map.c" -#ifndef UTYPES_DEFINED -typedef unsigned short u16; -#endif - // check: Heart of the alien, jaguar xj 220 PICO_INTERNAL void DmaSlowCell(unsigned int source, unsigned int a, int len, unsigned char inc) { @@ -32,7 +28,7 @@ asrc = cell_map(source >> 2) << 2; asrc |= source & 2; // if(a&1) d=(d<<8)|(d>>8); // ?? - r[a>>1] = *(u16 *)(base + asrc); + VideoWriteVRAM(a, *(u16 *)(base + asrc)); source += 2; // AutoIncrement a=(u16)(a+inc);
View file
libretro-picodrive-0~git20200112.tar.xz/pico/cd/mcd.c -> libretro-picodrive-0~git20200716.tar.xz/pico/cd/mcd.c
Changed
@@ -125,6 +125,7 @@ if (SekShouldInterrupt()) Pico_mcd->m.s68k_poll_a = 0; + pprof_start(s68k); SekCycleCntS68k += cyc_do; #if defined(EMU_C68K) PicoCpuCS68k.cycles = cyc_do; @@ -137,6 +138,7 @@ #elif defined(EMU_F68K) SekCycleCntS68k += fm68k_emulate(&PicoCpuFS68k, cyc_do, 0) - cyc_do; #endif + pprof_end(s68k); } static void pcd_set_cycle_mult(void)
View file
libretro-picodrive-0~git20200112.tar.xz/pico/cd/memory.c -> libretro-picodrive-0~git20200716.tar.xz/pico/cd/memory.c
Changed
@@ -14,12 +14,14 @@ uptr s68k_write8_map [0x1000000 >> M68K_MEM_SHIFT]; uptr s68k_write16_map[0x1000000 >> M68K_MEM_SHIFT]; +#ifndef _ASM_CD_MEMORY_C MAKE_68K_READ8(s68k_read8, s68k_read8_map) MAKE_68K_READ16(s68k_read16, s68k_read16_map) MAKE_68K_READ32(s68k_read32, s68k_read16_map) MAKE_68K_WRITE8(s68k_write8, s68k_write8_map) MAKE_68K_WRITE16(s68k_write16, s68k_write16_map) MAKE_68K_WRITE32(s68k_write32, s68k_write16_map) +#endif // -----------------------------------------------------------------
View file
libretro-picodrive-0~git20200112.tar.xz/pico/cd/memory_arm.S -> libretro-picodrive-0~git20200716.tar.xz/pico/cd/memory_arm.S
Changed
@@ -6,7 +6,8 @@ @* See COPYING file in the top-level directory. @* -#include "../pico_int_o32.h" +#include "../arm_features.h" +#include "../pico_int_offs.h" .equiv PCM_STEP_SHIFT, 11 @@ -65,6 +66,7 @@ .extern PicoWrite16_io .extern m68k_comm_check + PIC_LDR_INIT() @ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ @@ -73,16 +75,16 @@ @ r0=addr[in,out], r1,r2=tmp .macro cell_map ands r1, r0, #0x01c000 - ldrne pc, [pc, r1, lsr #12] - beq 0f @ most common? - .long 0f - .long 0f - .long 0f - .long 0f - .long 1f - .long 1f - .long 2f - .long 3f + PIC_XB(ne ,r1, lsr #12) + b 0f @ most common? + PIC_BT(0f) + PIC_BT(0f) + PIC_BT(0f) + PIC_BT(0f) + PIC_BT(1f) + PIC_BT(1f) + PIC_BT(2f) + PIC_BT(3f) 1: @ x16 cells and r1, r0, #0x7e00 @ col and r2, r0, #0x01fc @ row @@ -128,7 +130,7 @@ mov r3, #0x0e0000 0: cell_map - ldr r1, =Pico + PIC_LDR(r1, r2, Pico) add r0, r0, r3 ldr r1, [r1, #OFS_Pico_rom] @ Pico.mcd (used everywhere) eor r0, r0, #1 @@ -141,26 +143,26 @@ cmp r1, #0x2000 @ a120xx? bne PicoRead8_io - ldr r1, =Pico + PIC_LDR(r1, r2, Pico) and r0, r0, #0x3f ldr r1, [r1, #OFS_Pico_rom] @ Pico.mcd cmp r0, #0x0e - ldrlt pc, [pc, r0, lsl #2] + PIC_XB(lt ,r0, lsl #2) b m_m68k_read8_hi - .long m_m68k_read8_r00 - .long m_m68k_read8_r01 - .long m_m68k_read8_r02 - .long m_m68k_read8_r03 - .long m_m68k_read8_r04 - .long m_read_null @ unused bits - .long m_m68k_read8_r06 - .long m_m68k_read8_r07 - .long m_m68k_read8_r08 - .long m_m68k_read8_r09 - .long m_read_null @ reserved - .long m_read_null - .long m_m68k_read8_r0c - .long m_m68k_read8_r0d + PIC_BT(m_m68k_read8_r00) + PIC_BT(m_m68k_read8_r01) + PIC_BT(m_m68k_read8_r02) + PIC_BT(m_m68k_read8_r03) + PIC_BT(m_m68k_read8_r04) + PIC_BT(m_read_null) @ unused bits + PIC_BT(m_m68k_read8_r06) + PIC_BT(m_m68k_read8_r07) + PIC_BT(m_m68k_read8_r08) + PIC_BT(m_m68k_read8_r09) + PIC_BT(m_read_null) @ reserved + PIC_BT(m_read_null) + PIC_BT(m_m68k_read8_r0c) + PIC_BT(m_m68k_read8_r0d) m_m68k_read8_r00: add r1, r1, #0x110000 ldr r0, [r1, #0x30] @@ -178,9 +180,9 @@ bx lr m_m68k_read8_r03: add r1, r1, #0x110000 - push {r1, lr} + stmfd sp!, {r1, lr} bl m68k_comm_check - pop {r1, lr} + ldmfd sp!, {r1, lr} ldrb r0, [r1, #3] and r0, r0, #0xc7 bx lr @@ -219,10 +221,10 @@ add r1, r1, #0x110000 movge r0, #0 bxge lr - add r1, r0 - push {r1, lr} + add r1, r1, r0 + stmfd sp!, {r1, lr} bl m68k_comm_check - pop {r1, lr} + ldmfd sp!, {r1, lr} ldrb r0, [r1] bx lr @@ -238,7 +240,7 @@ mov r3, #0x0e0000 0: cell_map - ldr r1, =Pico + PIC_LDR(r1, r2, Pico) add r0, r0, r3 ldr r1, [r1, #OFS_Pico_rom] @ Pico.mcd bic r0, r0, #1 @@ -252,19 +254,19 @@ bne PicoRead16_io m_m68k_read16_m68k_regs: - ldr r1, =Pico + PIC_LDR(r1, r2, Pico) and r0, r0, #0x3e ldr r1, [r1, #OFS_Pico_rom] @ Pico.mcd cmp r0, #0x0e - ldrlt pc, [pc, r0, lsl #1] + PIC_XB(lt ,r0, lsl #1) b m_m68k_read16_hi - .long m_m68k_read16_r00 - .long m_m68k_read16_r02 - .long m_m68k_read16_r04 - .long m_m68k_read16_r06 - .long m_m68k_read16_r08 - .long m_read_null @ reserved - .long m_m68k_read16_r0c + PIC_BT(m_m68k_read16_r00) + PIC_BT(m_m68k_read16_r02) + PIC_BT(m_m68k_read16_r04) + PIC_BT(m_m68k_read16_r06) + PIC_BT(m_m68k_read16_r08) + PIC_BT(m_read_null) @ reserved + PIC_BT(m_m68k_read16_r0c) m_m68k_read16_r00: add r1, r1, #0x110000 ldr r0, [r1, #0x30] @@ -275,9 +277,9 @@ bx lr m_m68k_read16_r02: add r1, r1, #0x110000 - push {r1, lr} + stmfd sp!, {r1, lr} bl m68k_comm_check - pop {r1, lr} + ldmfd sp!, {r1, lr} ldrb r2, [r1, #3] ldrb r0, [r1, #2] and r2, r2, #0xc7 @@ -307,9 +309,9 @@ bxge lr add r1, r0, r1 - push {r1, lr} + stmfd sp!, {r1, lr} bl m68k_comm_check - pop {r0, lr} + ldmfd sp!, {r0, lr} ldrh r0, [r0] mov r1, r0, lsr #8 and r0, r0, #0xff @@ -329,7 +331,7 @@ 0: mov r3, r1 cell_map - ldr r2, =Pico + PIC_LDR(r2, r1, Pico) add r0, r0, r12 ldr r2, [r2, #OFS_Pico_rom] @ Pico.mcd ldr r2, [r2] @@ -357,7 +359,7 @@ 0: mov r3, r1 cell_map - ldr r1, =Pico + PIC_LDR(r1, r2, Pico)
View file
libretro-picodrive-0~git20200112.tar.xz/pico/debug.c -> libretro-picodrive-0~git20200716.tar.xz/pico/debug.c
Changed
@@ -43,6 +43,12 @@ !!(Pico.sv.flags & SRF_ENABLED), !!(Pico.sv.flags & SRF_EEPROM), Pico.sv.eeprom_type); MVP; sprintf(dstrp, "sram range: %06x-%06x, reg: %02x\n", Pico.sv.start, Pico.sv.end, Pico.m.sram_reg); MVP; sprintf(dstrp, "pend int: v:%i, h:%i, vdp status: %04x\n", bit(pv->pending_ints,5), bit(pv->pending_ints,4), pv->status); MVP; + sprintf(dstrp, "VDP regs 00-07: %02x %02x %02x %02x %02x %02x %02x %02x\n",reg[0],reg[1],reg[2],reg[3],reg[4],reg[5],reg[6],reg[7]); MVP; + sprintf(dstrp, "VDP regs 08-0f: %02x %02x %02x %02x %02x %02x %02x %02x\n",reg[8],reg[9],reg[10],reg[11],reg[12],reg[13],reg[14],reg[15]); MVP; + sprintf(dstrp, "VDP regs 10-17: %02x %02x %02x %02x %02x %02x %02x %02x\n",reg[16],reg[17],reg[18],reg[19],reg[20],reg[21],reg[22],reg[23]); MVP; + sprintf(dstrp, "VDP regs 18-1f: %02x %02x %02x %02x %02x %02x %02x %02x\n",reg[24],reg[25],reg[26],reg[27],reg[28],reg[29],reg[30],reg[31]); MVP; + r = (reg[5]<<9)+(reg[6]<<11); + sprintf(dstrp, "sprite #0: %04x %04x %04x %04x\n",PicoMem.vram[r/2],PicoMem.vram[r/2+1],PicoMem.vram[r/2+2],PicoMem.vram[r/2+3]); MVP; sprintf(dstrp, "pal: %i, hw: %02x, frame#: %i, cycles: %u\n", Pico.m.pal, Pico.m.hardware, Pico.m.frame_count, SekCyclesDone()); MVP; sprintf(dstrp, "M68k: PC: %06x, SR: %04x, irql: %i\n", SekPc, SekSr, SekIrqLevel); MVP; for (r = 0; r < 8; r++) { @@ -369,42 +375,32 @@ void PDebugZ80Frame(void) { - int lines, line_sample; + int lines; if (PicoIn.AHW & PAHW_SMS) return; - if (Pico.m.pal) { + if (Pico.m.pal) lines = 313; - line_sample = 68; - } else { + else lines = 262; - line_sample = 93; - } z80_resetCycles(); PsndStartFrame(); - if (/*Pico.m.z80Run &&*/ !Pico.m.z80_reset && (PicoIn.opt&POPT_EN_Z80)) - PicoSyncZ80(Pico.t.m68c_cnt + line_sample * 488); - if (PicoIn.sndOut) - PsndGetSamples(line_sample); - if (/*Pico.m.z80Run &&*/ !Pico.m.z80_reset && (PicoIn.opt&POPT_EN_Z80)) { PicoSyncZ80(Pico.t.m68c_cnt + 224 * 488); z80_int(); } - if (PicoIn.sndOut) - PsndGetSamples(224); // sync z80 if (/*Pico.m.z80Run &&*/ !Pico.m.z80_reset && (PicoIn.opt&POPT_EN_Z80)) { Pico.t.m68c_cnt += Pico.m.pal ? 151809 : 127671; // cycles adjusted for converter PicoSyncZ80(Pico.t.m68c_cnt); } - if (PicoIn.sndOut && ym2612.dacen && Pico.snd.dac_line < lines) - PsndDoDAC(lines - 1); - PsndDoPSG(lines - 1); + + if (PicoIn.sndOut) + PsndGetSamples(lines); timers_cycle(); Pico.t.m68c_aim = Pico.t.m68c_cnt;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/draw.c -> libretro-picodrive-0~git20200716.tar.xz/pico/draw.c
Changed
@@ -2,6 +2,7 @@ * line renderer * (c) Copyright Dave, 2004 * (C) notaz, 2006-2010 + * (C) kub, 2019-2020 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -29,6 +30,7 @@ */ #include "pico_int.h" +#define FORCE // layer forcing via debug register? int (*PicoScanBegin)(unsigned int num) = NULL; int (*PicoScanEnd) (unsigned int num) = NULL; @@ -45,6 +47,8 @@ static int HighCacheB[41+1]; static int HighPreSpr[80*2+1]; // slightly preprocessed sprites +unsigned int VdpSATCache[128]; // VDP sprite cache (1st 32 sprite attr bits) + #define LF_PLANE_1 (1 << 0) #define LF_SH (1 << 1) // must be = 2 #define LF_FORCE (1 << 2) @@ -53,7 +57,11 @@ #define SPRL_HAVE_LO 0x40 // *lo* #define SPRL_MAY_HAVE_OP 0x20 // may have operator sprites on the line #define SPRL_LO_ABOVE_HI 0x10 // low priority sprites may be on top of hi -unsigned char HighLnSpr[240][3 + MAX_LINE_SPRITES]; // sprite_count, ^flags, tile_count, [spritep]... +#define SPRL_HAVE_X 0x08 // have sprites with x != 0 +#define SPRL_TILE_OVFL 0x04 // tile limit exceeded on previous line +#define SPRL_HAVE_MASK0 0x02 // have sprite with x == 0 in 1st slot +#define SPRL_MASKED 0x01 // lo prio masking by sprite with x == 0 active +unsigned char HighLnSpr[240][4+MAX_LINE_SPRITES+1]; // sprite_count, ^flags, tile_count, sprites_total, [spritep]..., last_width int rendstatus_old; int rendlines; @@ -94,7 +102,7 @@ #define blockcpy memcpy #endif -#define TileNormMaker_(pix_func) \ +#define TileNormMaker_(pix_func,ret) \ { \ unsigned int t; \ \ @@ -106,9 +114,10 @@ t = (pack&0x0f000000)>>24; pix_func(5); \ t = (pack&0x00f00000)>>20; pix_func(6); \ t = (pack&0x000f0000)>>16; pix_func(7); \ + return ret; \ } -#define TileFlipMaker_(pix_func) \ +#define TileFlipMaker_(pix_func,ret) \ { \ unsigned int t; \ \ @@ -120,23 +129,24 @@ t = (pack&0x000000f0)>> 4; pix_func(5); \ t = (pack&0x00000f00)>> 8; pix_func(6); \ t = (pack&0x0000f000)>>12; pix_func(7); \ + return ret; \ } #define TileNormMaker(funcname, pix_func) \ static void funcname(unsigned char *pd, unsigned int pack, int pal) \ -TileNormMaker_(pix_func) +TileNormMaker_(pix_func,) #define TileFlipMaker(funcname, pix_func) \ static void funcname(unsigned char *pd, unsigned int pack, int pal) \ -TileFlipMaker_(pix_func) +TileFlipMaker_(pix_func,) #define TileNormMakerAS(funcname, pix_func) \ -static void funcname(unsigned char *pd, unsigned char *mb, unsigned int pack, int pal) \ -TileNormMaker_(pix_func) +static unsigned funcname(unsigned char *pd, unsigned m, unsigned int pack, int pal) \ +TileNormMaker_(pix_func,m) #define TileFlipMakerAS(funcname, pix_func) \ -static void funcname(unsigned char *pd, unsigned char *mb, unsigned int pack, int pal) \ -TileFlipMaker_(pix_func) +static unsigned funcname(unsigned char *pd, unsigned m, unsigned int pack, int pal) \ +TileFlipMaker_(pix_func,m) #define pix_just_write(x) \ if (t) pd[x]=pal|t @@ -178,17 +188,19 @@ #endif +// AS: sprite mask bits in m shifted to bits 8-15, see DrawSpritesHiAS + // draw a sprite pixel (AS) #define pix_as(x) \ - if (t & mb[x]) mb[x] = 0, pd[x] = pal | t + if (t && (m & (1<<(x+8)))) m &= ~(1<<(x+8)), pd[x] = pal | t TileNormMakerAS(TileNormAS, pix_as) TileFlipMakerAS(TileFlipAS, pix_as) // draw a sprite pixel, process operator colors (AS) #define pix_sh_as(x) \ - if (t & mb[x]) { \ - mb[x] = 0; \ + if (t && (m & (1<<(x+8)))) { \ + m &= ~(1<<(x+8)); \ if (t>=0xe) pd[x]=(pd[x]&0x3f)|(t<<6); /* c0 shadow, 80 hilight */ \ else pd[x] = pal | t; \ } @@ -197,8 +209,8 @@ TileFlipMakerAS(TileFlipSH_AS, pix_sh_as) #define pix_sh_as_onlyop(x) \ - if (t & mb[x]) { \ - mb[x] = 0; \ + if (t && (m & (1<<(x+8)))) { \ + m &= ~(1<<(x+8)); \ pix_sh_onlyop(x); \ } @@ -207,11 +219,12 @@ // mark pixel as sprite pixel (AS) #define pix_sh_as_onlymark(x) \ - if (t) mb[x] = 0 + if (t) m &= ~(1<<(x+8)) TileNormMakerAS(TileNormAS_onlymark, pix_sh_as_onlymark) TileFlipMakerAS(TileFlipAS_onlymark, pix_sh_as_onlymark) +#ifdef FORCE // forced both layer draw (through debug reg) #define pix_and(x) \ pd[x] = (pd[x] & 0xc0) | (pd[x] & (pal | t)) @@ -219,6 +232,18 @@ TileNormMaker(TileNorm_and, pix_and) TileFlipMaker(TileFlip_and, pix_and) +// forced sprite draw (through debug reg) +#define pix_sh_as_and(x) /* XXX is there S/H with forced draw? */ \ + if (m & (1<<(x+8))) { \ + m &= ~(1<<(x+8)); \ + if (t>=0xe) pd[x]=(pd[x]&0x3f)|(t<<6); /* c0 shadow, 80 hilight */ \ + else pd[x] = (pd[x] & 0xc0) | (pd[x] & (pal | t)); \ + } + +TileNormMakerAS(TileNormSH_AS_and, pix_sh_as_and) +TileFlipMakerAS(TileFlipSH_AS_and, pix_sh_as_and) +#endif + // -------------------------------------------- #ifndef _ASM_DRAW_C @@ -293,6 +318,7 @@ int adj = ((ts->hscroll ^ dx) >> 3) & 1; cell -= adj + 1; ts->cells -= adj; + PicoMem.vsram[0x3e] = PicoMem.vsram[0x3f] = plane_sh >> 16; } cell+=cellskip; tilex+=cellskip; @@ -306,7 +332,7 @@ //if((cell&1)==0) { int line,vscroll; - vscroll=PicoMem.vsram[(plane_sh&1)+(cell&~1)]; + vscroll=PicoMem.vsram[(plane_sh&1)+(cell&0x3e)]; // Find the line in the name table line=(vscroll+scan)&ts->line&0xffff; // ts->line is really ymask .. @@ -315,7 +341,7 @@ } code=PicoMem.vram[ts->nametab+nametabadd+(tilex&ts->xmask)]; - if (code==blank) continue; + if ((code<<16|ty)==blank) continue; if (code>>15) { // high priority tile int cval = code | (dx<<16) | (ty<<25); if(code&0x1000) cval^=7<<26; @@ -327,14 +353,15 @@ oldcode = code; // Get tile address/2: addr=(code&0x7ff)<<4; - if (code&0x1000) addr+=14-ty; else addr+=ty; // Y-flip pal=((code>>9)&0x30)|((plane_sh<<5)&0x40); } - pack = *(unsigned int *)(PicoMem.vram + addr); + if (code & 0x1000) ty ^= 0xe; // Y-flip + pack = *(unsigned int *)(PicoMem.vram + addr+ty); + if (!pack) { - blank = code; + blank = code<<16|ty; continue;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/draw2.c -> libretro-picodrive-0~git20200716.tar.xz/pico/draw2.c
Changed
@@ -20,7 +20,7 @@ #define LINE_WIDTH 328 #endif -static unsigned char PicoDraw2FB_[(8+320) * (8+240+8)]; +static unsigned char PicoDraw2FB_[(8+320) * (8+240+8) + 8]; static int HighCache2A[41*(TILE_ROWS+1)+1+1]; // caches for high layers static int HighCache2B[41*(TILE_ROWS+1)+1+1]; @@ -157,6 +157,8 @@ { nametab=(pvid->reg[3]&0x3e)<<9; // 32-cell mode nametab_step = 1<<5; + if (!(PicoIn.opt&POPT_DIS_32C_BORDER)) + scrpos += 32; } nametab += nametab_step*start; @@ -240,6 +242,8 @@ else nametab=(pvid->reg[4]&0x07)<<12; // B scrpos = est->Draw2FB; + if (!(pvid->reg[12]&1) && !(PicoIn.opt&POPT_DIS_32C_BORDER)) + scrpos += 32; scrpos+=8*LINE_WIDTH*(planestart-START_ROW); // Get vertical scroll value: @@ -315,6 +319,8 @@ short blank=-1; // The tile we know is blank unsigned char *scrpos = est->Draw2FB, *pd = 0; + if (!(Pico.video.reg[12]&1) && !(PicoIn.opt&POPT_DIS_32C_BORDER)) + scrpos += 32; // *hcache++ = code|(dx<<16)|(trow<<27); // cache it scrpos+=(*hc++)*LINE_WIDTH - START_ROW*LINE_WIDTH*8; @@ -377,6 +383,8 @@ while(sy <= START_ROW*8) { sy+=8; tile+=tdeltay; height--; } scrpos = est->Draw2FB; + if (!(Pico.video.reg[12]&1) && !(PicoIn.opt&POPT_DIS_32C_BORDER)) + scrpos += 32; scrpos+=(sy-START_ROW*8)*LINE_WIDTH; for (; height > 0; height--, sy+=8, tile+=tdeltay) @@ -412,12 +420,13 @@ int i,u,link=0; unsigned int *sprites[80]; // Sprites int y_min=START_ROW*8, y_max=END_ROW*8; // for a simple sprite masking + int max_sprites = Pico.video.reg[12]&1 ? 80 : 64; table=pvid->reg[5]&0x7f; if (pvid->reg[12]&1) table&=0x7e; // Lowest bit 0 in 40-cell mode table<<=8; // Get sprite table address/2 - for (i=u=0; u < 80; u++) + for (i = u = 0; u < max_sprites && link < max_sprites; u++) { unsigned int *sprite=NULL; int code, code2, sx, sy, height; @@ -502,6 +511,11 @@ maxw = 264; maxcolc = 32; } + // 32C border for centering? (for asm) + est->rendstatus &= ~PDRAW_BORDER_32; + if ((est->rendstatus&PDRAW_32_COLS) && !(PicoIn.opt&POPT_DIS_32C_BORDER)) + est->rendstatus |= PDRAW_BORDER_32; + // horizontal window? if ((win=pvid->reg[0x12])) {
View file
libretro-picodrive-0~git20200112.tar.xz/pico/draw2_arm.S -> libretro-picodrive-0~git20200716.tar.xz/pico/draw2_arm.S
Changed
@@ -8,7 +8,7 @@ * this is highly specialized, be careful if changing related C code! */ -#include "pico_int_o32.h" +#include "pico_int_offs.h" @ define these constants in your include file: @ .equiv START_ROW, 1 @@ -414,7 +414,10 @@ ldr r11,[sp, #9*4] @ est sub r4, r9, #(START_ROW<<24) + ldr r7, [r11, #OFS_EST_rendstatus] ldr r11, [r11, #OFS_EST_Draw2FB] + tst r7, #0x100 @ H32 border mode? + addne r11, r11, #32 mov r4, r4, asr #24 mov r7, #328*8 mla r11, r4, r7, r11 @ scrpos+=8*328*(planestart-START_ROW); @@ -590,8 +593,11 @@ mov r9, #0xff000000 @ r9=prevcode=-1 mvn r6, #0 @ r6=prevy=-1 + ldr r7, [r1, #OFS_EST_rendstatus] ldr r4, [r1, #OFS_EST_Draw2FB] ldr r2, [r0], #4 @ read y offset + tst r7, #0x100 @ H32 border mode? + addne r4, r4, #32 mov r7, #328 mla r2, r7, r2, r4 sub r12, r2, #(328*8*START_ROW) @ r12=scrpos @@ -688,13 +694,18 @@ ldr r4, [r11, #OFS_Pico_video_reg+12] mov r5, #1 @ nametab_step + ldr r11, [r3, #OFS_EST_Draw2FB] tst r4, #1 @ 40 cell mode? andne r12, r12, #0xf000 @ 0x3c<<10 - andeq r12, r12, #0xf800 movne r5, r5, lsl #7 - moveq r5, r5, lsl #6 @ nametab_step - - and r4, r0, #0xff + bne 0f + ldr r7, [r3, #OFS_EST_rendstatus] + and r12, r12, #0xf800 + mov r5, r5, lsl #6 @ nametab_step + tst r7, #0x100 + addne r11, r11, #32 @ center screen in H32 mode + +0: and r4, r0, #0xff mla r12, r5, r4, r12 @ nametab += nametab_step*start; ldr r10, [r3, #OFS_EST_PicoMem_vram] @@ -715,7 +726,6 @@ mov r9, #0xff000000 @ r9=prevcode=-1 - ldr r11, [r3, #OFS_EST_Draw2FB] and r4, r0, #0xff add r11, r11, #328*8 sub r4, r4, #START_ROW @@ -915,8 +925,11 @@ and r3, lr, #0x6000 mov r3, r3, lsr #9 @ r3=pal=((code>>9)&0x30); + ldr r0, [r1, #OFS_EST_rendstatus] ldr r11, [r1, #OFS_EST_Draw2FB] ldr r10, [r1, #OFS_EST_PicoMem_vram] + tst r0, #0x100 @ H32 border mode? + addne r11, r11, #32 sub r1, r12, #(START_ROW*8) mov r0, #328 mla r11, r1, r0, r11 @ scrpos+=(sy-START_ROW*8)*328;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/draw_arm.S -> libretro-picodrive-0~git20200716.tar.xz/pico/draw_arm.S
Changed
@@ -1,6 +1,7 @@ /* * assembly optimized versions of most funtions from draw.c * (C) notaz, 2006-2010,2017 + * (C) kub, 2020 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -8,13 +9,13 @@ * this is highly specialized, be careful if changing related C code! */ -#include "pico_int_o32.h" +#include "pico_int_offs.h" .extern DrawStripInterlace .equ PDRAW_SPRITES_MOVED, (1<<0) .equ PDRAW_WND_DIFF_PRIO, (1<<1) -.equ PDRAW_ACC_SPRITES, (1<<2) +.equ PDRAW_PARSE_SPRITES, (1<<2) .equ PDRAW_DIRTY_SPRITES, (1<<4) .equ PDRAW_PLANE_HI_PRIO, (1<<6) .equ PDRAW_SHHI_DONE, (1<<7) @@ -317,9 +318,10 @@ moveq r1, #0x0007 movgt r1, #0x00ff @ r1=ymask=(height<<8)|0xff; ...; // Y Mask in pixels - add r10, r10, #5 - cmp r10, #7 - subge r10, r10, #1 @ r10=shift[width] (5,6,6,7) + cmp r10, #2 + addlt r10, r10, #5 + moveq r10, #5 + movgt r10, #7 @ r10=shift[width] (5,6,5,7) ldr r2, [r12, #OFS_EST_DrawScanline] ldr lr, [r12, #OFS_EST_PicoMem_vram] @@ -342,11 +344,15 @@ mov r4, r8, lsr #8 @ pvid->reg[13] mov r4, r4, lsl #10 @ htab=pvid->reg[13]<<9; (halfwords) - tst r7, #2 - addne r4, r4, r2, lsl #2 @ htab+=DrawScanline<<1; // Offset by line - tst r7, #1 - biceq r4, r4, #0x1f @ htab&=~0xf; // Offset by tile - add r4, r4, r0, lsl #1 @ htab+=plane + + ands r3, r7, #0x03 + beq 0f + cmp r3, #2 + mov r3, r2, lsl #2 @ htab+=DrawScanline<<1; // Offset by line + biceq r3, #0x1f @ htab&=~0xf; // Offset by tile + andlt r3, #0x1f + add r4, r4, r3 +0: add r4, r4, r0, lsl #1 @ htab+=plane bic r4, r4, #0x00ff0000 @ just in case ldrh r3, [lr, r4] @ r3=hscroll @@ -362,7 +368,8 @@ bne .DrawStrip_interlace tst r0, r0 - movne r7, r7, lsr #16 + moveq r7, r7, lsl #16 + mov r7, r7, lsr #16 @ Find the line in the name table add r2, r2, r7 @@ -517,6 +524,9 @@ @ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ .DrawStrip_vsscroll: + tst r8, #1 @ if h40: lflags |= 0x10000 + orrne r0, r0, #0x10000 + rsb r8, r3, #0 mov r8, r8, lsr #3 @ r8=tilex=(-ts->hscroll)>>3 bic r8, r8, #0x3fc00000 @@ -541,6 +551,12 @@ tst r3, #0x08 subne r10,r10, #1<<16 @ cells-- subne r10,r10, #1<<24 @ cell-- // even more negative + + add_c24 r1, lr, (OFS_PMEM_vsram-OFS_PMEM_vram) + tst r0, #0x10000 @ h40? + ldrne r3, [r1, #0x00] @ r3=vsram[0x00..0x01] + ldreq r3, [r1, #0x40] @ r3=vsram[0x20..0x21] + str r3, [r1, #0x7c] @ vsram[0x3e..0x3f]=r3 0: tst r9, #1<<31 mov r3, #0 @@ -571,8 +587,8 @@ @ calc offset and read tileline code to r7, also calc ty add_c24 r7, lr, (OFS_PMEM_vsram-OFS_PMEM_vram) - add r7, r7, r10,asr #23 @ vsram + ((cell&~1)<<1) - bic r7, r7, #3 + and r4, r10, #0x3e000000 + add r7, r7, r4, asr #23 @ vsram + ((cell&0x3e)<<1) tst r10,#0x8000 @ plane1? addne r7, r7, #2 ldrh r7, [r7] @ r7=vscroll @@ -599,6 +615,7 @@ tst r7, #0x8000 bne .DrawStrip_vs_hiprio + orr r7, r7, r10, lsl #24 @ code | (ty << 24) cmp r7, r9 beq .DrawStrip_vs_samecode @ we know stuff about this tile already @@ -694,8 +711,8 @@ @ interlace mode 2? Sonic 2? .DrawStrip_interlace: tst r0, r0 - moveq r7, r7, lsl #21 - movne r7, r7, lsl #5 + movne r7, r7, lsr #16 + mov r7, r7, lsl #21 @ Find the line in the name table add r2, r7, r2, lsl #22 @ r2=(vscroll+(DrawScanline<<1))<<21 (11 bits); @@ -790,7 +807,6 @@ add r1, r11, r4 @ r1=pdest movs r7, r6, lsl #16 - bpl .dtfc_loop @ !(code & 0x8000) cmp r5, r7, lsr #16 beq .dtfc_samecode @ if (code==prevcode) @@ -942,17 +958,23 @@ .global DrawSpritesSHi DrawSpritesSHi: - ldr r3, [r0] + ldrb r3, [r0] mov r12,#0xff ands r3, r3, #0x7f bxeq lr - stmfd sp!, {r1,r4-r11,lr} @ +est - strb r12,[r0,#2] @ set end marker - add r10,r0, #3 @ r10=HighLnSpr end + stmfd sp!, {r1,r3-r11,lr} @ +est + strb r12,[r0,#3] @ set end marker + ldrb r12,[r0,#1] + add r10,r0, #4 @ r10=HighLnSpr end + mvn r12,r12 + tst r12,#0x6 @ masking in slot 1 and tile ovfl? + ldmeqfd sp!, {r1,r3-r11,pc} add r10,r10,r3 @ r10=HighLnSpr end + ldrb r12,[r10,#0] @ width of last sprite ldr r11,[r1, #OFS_EST_HighCol] + str r12,[sp, #4] mov r12,#0xf ldr lr, [r1, #OFS_EST_PicoMem_vram] @@ -963,7 +985,7 @@ ldr r7, [sp] @ est ldr r1, [r7, #OFS_EST_HighPreSpr] cmp r0, #0xff - ldmeqfd sp!, {r1,r4-r11,pc} @ end of list + ldmeqfd sp!, {r1,r3-r11,pc} @ end of list and r0, r0, #0x7f add r0, r1, r0, lsl #3 @@ -1007,10 +1029,16 @@ and r7, r7, #7 add r8, r8, r7, lsl #1 @ tile+=(row&7)<<1; // Tile address + ldr r0, [sp, #4] + add r6, r6, #1 @ inc now + cmp r0, #0 @ check width of last sprite + movne r6, r0 + movne r0, #0 + strne r0, [sp, #4] + mov r5, r5, lsl #4 @ delta<<=4; // Delta of address mov r3, r4, lsr #9 @ r3=pal=((code>>9)&0x30); - add r6, r6, #1 @ inc now adds r0, r2, #0 @ mov sx to r0 and set ZV flags b .dsprShi_loop_enter @@ -1126,11 +1154,18 @@ @ time to do some real work stmfd sp!, {r1,r3-r11,lr} @ +sh|prio<<1 +est mov r12,#0xff - strb r12,[r0,#2] @ set end marker - add r10,r0, #3 + strb r12,[r0,#3] @ set end marker + ldrb r12,[r0,#1] + add r10,r0 ,#4 + mvn r12,r12 + tst r12,#0x6 @ masking in slot 1 and tile ovfl? + ldmeqfd sp!, {r1,r3-r11,pc} add r10,r10,r2 @ r10=HighLnSpr end + ldrb r12,[r10,#0] @ width of last sprite
View file
libretro-picodrive-0~git20200112.tar.xz/pico/m68kif_cyclone.s -> libretro-picodrive-0~git20200716.tar.xz/pico/m68kif_cyclone.s
Changed
@@ -87,19 +87,19 @@ orrcc r0, r1, r0, lsl #16 bxcc lr - stmfd sp!,{r0,r1,lr} + stmfd sp!,{r0,r1,r2,lr} mov lr, pc bx r1 mov r2, r0, lsl #16 - ldmia sp, {r0,r1} + ldmfd sp!, {r0,r1} str r2, [sp] add r0, r0, #2 mov lr, pc bx r1 - ldr r1, [sp] + ldmfd sp!, {r1,lr} mov r0, r0, lsl #16 orr r0, r1, r0, lsr #16 - ldmfd sp!,{r1,r2,pc} + bx lr cyclone_write8: @ u32 a, u8 d
View file
libretro-picodrive-0~git20200112.tar.xz/pico/memory.c -> libretro-picodrive-0~git20200716.tar.xz/pico/memory.c
Changed
@@ -163,12 +163,14 @@ m68k_write16_map[i] = (addr >> 1) | MAP_FLAG; } +#ifndef _ASM_MEMORY_C MAKE_68K_READ8(m68k_read8, m68k_read8_map) MAKE_68K_READ16(m68k_read16, m68k_read16_map) MAKE_68K_READ32(m68k_read32, m68k_read16_map) MAKE_68K_WRITE8(m68k_write8, m68k_write8_map) MAKE_68K_WRITE16(m68k_write16, m68k_write16_map) MAKE_68K_WRITE32(m68k_write32, m68k_write16_map) +#endif // ----------------------------------------------------------------- @@ -389,7 +391,7 @@ static void psg_write_68k(u32 d) { // look for volume write and update if needed - if ((d & 0x90) == 0x90 && Pico.snd.psg_line < Pico.m.scanline) + if ((d & 0x90) == 0x90) PsndDoPSG(Pico.m.scanline); SN76496Write(d); @@ -399,8 +401,7 @@ { if ((d & 0x90) == 0x90) { int scanline = get_scanline(1); - if (Pico.snd.psg_line < scanline) - PsndDoPSG(scanline); + PsndDoPSG(scanline); } SN76496Write(d); @@ -420,6 +421,7 @@ d = EEPROM_read(); if (!(a & 1)) d >>= 8; + d &= 0xff; } else d = *(u8 *)(Pico.sv.data - Pico.sv.start + a); elprintf(EL_SRAMIO, "sram r8 [%06x] %02x @ %06x", a, d, SekPc); @@ -543,7 +545,7 @@ } if ((a & 0x6000) == 0x4000) { // FM Sound if (PicoIn.opt & POPT_EN_FM) - Pico.m.status |= ym2612_write_local(a & 3, d & 0xff, 0) & 1; + ym2612_write_local(a & 3, d & 0xff, 0); return; } // TODO: probably other VDP access too? Maybe more mirrors? @@ -730,8 +732,10 @@ static void PicoWrite16_vdp(u32 a, u32 d) { - if ((a & 0x00f9) == 0x0010) // PSG Sound + if ((a & 0x00f9) == 0x0010) { // PSG Sound psg_write_68k(d); + return; + } if ((a & 0x00e0) == 0x0000) { PicoVideoWrite(a, d); return; @@ -879,7 +883,7 @@ static int get_scanline(int is_from_z80) { if (is_from_z80) { - int mclk_z80 = z80_cyclesDone() * 15; + int mclk_z80 = (z80_cyclesLeft<0 ? Pico.t.z80c_aim : z80_cyclesDone()) * 15; int mclk_line = Pico.t.z80_scanline * 488 * 7; while (mclk_z80 - mclk_line >= 488 * 7) Pico.t.z80_scanline++, mclk_line += 488 * 7; @@ -895,10 +899,10 @@ int xcycles = z80_cycles << 8; /* check for overflows */ - if ((mode_old & 4) && xcycles > Pico.t.timer_a_next_oflow) + if ((mode_old & 4) && xcycles >= Pico.t.timer_a_next_oflow) ym2612.OPN.ST.status |= 1; - if ((mode_old & 8) && xcycles > Pico.t.timer_b_next_oflow) + if ((mode_old & 8) && xcycles >= Pico.t.timer_b_next_oflow) ym2612.OPN.ST.status |= 2; /* update timer a */ @@ -940,11 +944,11 @@ a &= 3; if (a == 1 && ym2612.OPN.ST.address == 0x2a) /* DAC data */ { - int scanline = get_scanline(is_from_z80); - //elprintf(EL_STATUS, "%03i -> %03i dac w %08x z80 %i", Pico.snd.dac_line, scanline, d, is_from_z80); + int cycles = is_from_z80 ? z80_cyclesDone() : z80_cycles_from_68k(); + //elprintf(EL_STATUS, "%03i dac w %08x z80 %i", cycles, d, is_from_z80); ym2612.dacout = ((int)d - 0x80) << 6; if (ym2612.dacen) - PsndDoDAC(scanline); + PsndDoDAC(cycles); return 0; } @@ -1026,13 +1030,9 @@ return 0; } case 0x2b: { /* DAC Sel (YM2612) */ - int scanline = get_scanline(is_from_z80); - if (ym2612.dacen != (d & 0x80)) { - ym2612.dacen = d & 0x80; - Pico.snd.dac_line = scanline; - } + ym2612.dacen = d & 0x80; #ifdef __GP2X__ - if (PicoIn.opt & POPT_EXT_FM) YM2612Write_940(a, d, scanline); + if (PicoIn.opt & POPT_EXT_FM) YM2612Write_940(a, d, get_scanline(is_from_z80)); #endif return 0; } @@ -1060,6 +1060,7 @@ if (PicoIn.opt & POPT_EXT_FM) return YM2612Write_940(a, d, get_scanline(is_from_z80)); #endif + PsndDoFM(is_from_z80 ? z80_cyclesDone() : z80_cycles_from_68k()); return YM2612Write_(a, d); } @@ -1221,7 +1222,7 @@ static void z80_md_ym2612_write(unsigned int a, unsigned char data) { if (PicoIn.opt & POPT_EN_FM) - Pico.m.status |= ym2612_write_local(a, data, 1) & 1; + ym2612_write_local(a, data, 1); } static void z80_md_vdp_br_write(unsigned int a, unsigned char data)
View file
libretro-picodrive-0~git20200112.tar.xz/pico/memory.h -> libretro-picodrive-0~git20200716.tar.xz/pico/memory.h
Changed
@@ -2,13 +2,6 @@ #include "pico_port.h" -#ifndef UTYPES_DEFINED -typedef unsigned char u8; -typedef unsigned short u16; -typedef unsigned int u32; -#endif -typedef uintptr_t uptr; // unsigned pointer-sized int - #define M68K_MEM_SHIFT 16 // minimum size we can map #define M68K_BANK_SIZE (1 << M68K_MEM_SHIFT) @@ -32,8 +25,17 @@ extern u32 m68k_read8(u32 a); extern u32 m68k_read16(u32 a); +extern u32 m68k_read32(u32 a); extern void m68k_write8(u32 a, u8 d); extern void m68k_write16(u32 a, u16 d); +extern void m68k_write32(u32 a, u32 d); + +extern u32 s68k_read8(u32 a); +extern u32 s68k_read16(u32 a); +extern u32 s68k_read32(u32 a); +extern void s68k_write8(u32 a, u8 d); +extern void s68k_write16(u32 a, u16 d); +extern void s68k_write32(u32 a, u32 d); // z80 #define Z80_MEM_SHIFT 13
View file
libretro-picodrive-0~git20200112.tar.xz/pico/memory_amips.S -> libretro-picodrive-0~git20200716.tar.xz/pico/memory_amips.S
Changed
@@ -8,7 +8,7 @@ # OUT OF DATE -#include "pico_int_o32.h" +#include "pico_int_offs.h" .set noreorder .set noat
View file
libretro-picodrive-0~git20200112.tar.xz/pico/memory_arm.S -> libretro-picodrive-0~git20200716.tar.xz/pico/memory_arm.S
Changed
@@ -1,12 +1,14 @@ /* * PicoDrive * (C) notaz, 2006-2009 + * (C) kub, 2019 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. */ -#include "pico_int_o32.h" +#include "arm_features.h" +#include "pico_int_offs.h" .equ SRR_MAPPED, (1 << 0) .equ SRR_READONLY, (1 << 1) @@ -23,8 +25,10 @@ .global PicoWrite8_io .global PicoWrite16_io + PIC_LDR_INIT() + PicoRead8_sram: @ u32 a - ldr r3, =Pico + PIC_LDR(r3, r1, Pico) ldr r1, [r3, #OFS_Pico_sv_end] cmp r0, r1 bgt m_read8_nosram @@ -59,6 +63,7 @@ ldmfd sp!,{r1,lr} tst r1, #1 moveq r0, r0, lsr #8 + and r0, r0, #0xff bx lr @@ -72,7 +77,7 @@ cmp r2, #0x1000 bne PicoRead8_32x - ldr r3, =Pico + PIC_LDR(r3, r1, Pico) mov r1, r0 ldr r0, [r3, #OFS_Pico_m_rotate] add r0, r0, #1 @@ -95,7 +100,7 @@ @ @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ PicoRead16_sram: @ u32 a, u32 d - ldr r3, =Pico + PIC_LDR(r3, r1, Pico) ldr r1, [r3, #OFS_Pico_sv_end] cmp r0, r1 bgt m_read16_nosram @@ -140,7 +145,7 @@ cmp r2, #0x1000 bne PicoRead16_32x - ldr r3, =Pico + PIC_LDR(r3, r2, Pico) and r2, r0, #0xff00 ldr r0, [r3, #OFS_Pico_m_rotate] add r0, r0, #1 @@ -182,7 +187,7 @@ eor r2, r2, #0x003000 eors r2, r2, #0x0000f1 bne PicoWrite8_32x - ldr r3, =Pico + PIC_LDR(r3, r2, Pico) ldrb r2, [r3, #OFS_Pico_m_sram_reg] and r1, r1, #(SRR_MAPPED|SRR_READONLY) bic r2, r2, #(SRR_MAPPED|SRR_READONLY) @@ -212,7 +217,7 @@ eor r2, r2, #0x003000 eors r2, r2, #0x0000f0 bne PicoWrite16_32x - ldr r3, =Pico + PIC_LDR(r3, r2, Pico) ldrb r2, [r3, #OFS_Pico_m_sram_reg] and r1, r1, #(SRR_MAPPED|SRR_READONLY) bic r2, r2, #(SRR_MAPPED|SRR_READONLY) @@ -220,6 +225,103 @@ strb r2, [r3, #OFS_Pico_m_sram_reg] bx lr +.global m68k_read8 +.global m68k_read16 +.global m68k_read32 +.global m68k_write8 +.global m68k_write16 +.global m68k_write32 + +m68k_read8: + PIC_LDR(r3, r2, m68k_read8_map) + bic r0, r0, #0xff000000 + mov r2, r0, lsr #16 + ldr r3, [r3, r2, lsl #2] + eor r2, r0, #1 + movs r3, r3, lsl #1 + ldrccb r0, [r3, r2] + bxcc lr + bx r3 + +m68k_read16: + PIC_LDR(r3, r2, m68k_read16_map) + bic r0, r0, #0xff000000 + mov r2, r0, lsr #16 + ldr r3, [r3, r2, lsl #2] + bic r0, r0, #1 + movs r3, r3, lsl #1 + ldrcch r0, [r3, r0] + bxcc lr + bx r3 + +m68k_read32: + PIC_LDR(r3, r2, m68k_read16_map) + bic r0, r0, #0xff000000 + mov r2, r0, lsr #16 + ldr r3, [r3, r2, lsl #2] + bic r0, r0, #1 + movs r3, r3, lsl #1 + ldrcch r1, [r3, r0]! + ldrcch r0, [r3, #2] + orrcc r0, r0, r1, lsl #16 + bxcc lr + + stmfd sp!, {r0, r3, r4, lr} + mov lr, pc + bx r3 + ldmfd sp!, {r1, r3} + str r0, [sp] + add r0, r1, #2 + mov lr, pc + bx r3 + ldmfd sp!, {r1, lr} + mov r0, r0, lsl #16 + mov r1, r1, lsl #16 + orr r0, r1, r0, lsr #16 + bx lr + +m68k_write8: + PIC_LDR(r3, r2, m68k_write8_map) + bic r0, r0, #0xff000000 + mov r2, r0, lsr #16 + ldr r3, [r3, r2, lsl #2] + eor r2, r0, #1 + movs r3, r3, lsl #1 + strccb r1, [r3, r2] + bxcc lr + bx r3 + +m68k_write16: + PIC_LDR(r3, r2, m68k_write16_map) + bic r0, r0, #0xff000000 + mov r2, r0, lsr #16 + ldr r3, [r3, r2, lsl #2] + bic r0, r0, #1 + movs r3, r3, lsl #1 + strcch r1, [r3, r0] + bxcc lr + bx r3 + +m68k_write32: + PIC_LDR(r3, r2, m68k_write16_map) + bic r0, r0, #0xff000000 + mov r2, r0, lsr #16 + ldr r3, [r3, r2, lsl #2] + bic r0, r0, #1 + movs r3, r3, lsl #1 + movcc r2, r1, lsr #16 + strcch r2, [r3, r0]! + strcch r1, [r3, #2] + bxcc lr + + stmfd sp!, {r0, r1, r3, lr} + mov r1, r1, lsr #16 + mov lr, pc + bx r3 + ldmfd sp!, {r0, r1, r3, lr} + add r0, r0, #2 + bx r3 + .pool @ vim:filetype=armasm
View file
libretro-picodrive-0~git20200112.tar.xz/pico/misc.c -> libretro-picodrive-0~git20200716.tar.xz/pico/misc.c
Changed
@@ -1,6 +1,7 @@ /* * rarely used EEPROM code * (C) notaz, 2006-2008 + * (C) kub, 2020 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -8,83 +9,174 @@ #include "pico_int.h" -// H-counter table for hvcounter reads in 40col mode -// based on Gens code +// H-counter table for hvcounter reads in 40col mode, starting at HINT const unsigned char hcounts_40[] = { -0x07,0x07,0x08,0x08,0x08,0x09,0x09,0x0a,0x0a,0x0b,0x0b,0x0b,0x0c,0x0c,0x0d,0x0d, -0x0e,0x0e,0x0e,0x0f,0x0f,0x10,0x10,0x10,0x11,0x11,0x12,0x12,0x13,0x13,0x13,0x14, -0x14,0x15,0x15,0x15,0x16,0x16,0x17,0x17,0x18,0x18,0x18,0x19,0x19,0x1a,0x1a,0x1b, -0x1b,0x1b,0x1c,0x1c,0x1d,0x1d,0x1d,0x1e,0x1e,0x1f,0x1f,0x20,0x20,0x20,0x21,0x21, -0x22,0x22,0x23,0x23,0x23,0x24,0x24,0x25,0x25,0x25,0x26,0x26,0x27,0x27,0x28,0x28, -0x28,0x29,0x29,0x2a,0x2a,0x2a,0x2b,0x2b,0x2c,0x2c,0x2d,0x2d,0x2d,0x2e,0x2e,0x2f, -0x2f,0x30,0x30,0x30,0x31,0x31,0x32,0x32,0x32,0x33,0x33,0x34,0x34,0x35,0x35,0x35, -0x36,0x36,0x37,0x37,0x38,0x38,0x38,0x39,0x39,0x3a,0x3a,0x3a,0x3b,0x3b,0x3c,0x3c, -0x3d,0x3d,0x3d,0x3e,0x3e,0x3f,0x3f,0x3f,0x40,0x40,0x41,0x41,0x42,0x42,0x42,0x43, -0x43,0x44,0x44,0x45,0x45,0x45,0x46,0x46,0x47,0x47,0x47,0x48,0x48,0x49,0x49,0x4a, -0x4a,0x4a,0x4b,0x4b,0x4c,0x4c,0x4d,0x4d,0x4d,0x4e,0x4e,0x4f,0x4f,0x4f,0x50,0x50, -0x51,0x51,0x52,0x52,0x52,0x53,0x53,0x54,0x54,0x55,0x55,0x55,0x56,0x56,0x57,0x57, -0x57,0x58,0x58,0x59,0x59,0x5a,0x5a,0x5a,0x5b,0x5b,0x5c,0x5c,0x5c,0x5d,0x5d,0x5e, -0x5e,0x5f,0x5f,0x5f,0x60,0x60,0x61,0x61,0x62,0x62,0x62,0x63,0x63,0x64,0x64,0x64, -0x65,0x65,0x66,0x66,0x67,0x67,0x67,0x68,0x68,0x69,0x69,0x6a,0x6a,0x6a,0x6b,0x6b, -0x6c,0x6c,0x6c,0x6d,0x6d,0x6e,0x6e,0x6f,0x6f,0x6f,0x70,0x70,0x71,0x71,0x71,0x72, -0x72,0x73,0x73,0x74,0x74,0x74,0x75,0x75,0x76,0x76,0x77,0x77,0x77,0x78,0x78,0x79, -0x79,0x79,0x7a,0x7a,0x7b,0x7b,0x7c,0x7c,0x7c,0x7d,0x7d,0x7e,0x7e,0x7f,0x7f,0x7f, -0x80,0x80,0x81,0x81,0x81,0x82,0x82,0x83,0x83,0x84,0x84,0x84,0x85,0x85,0x86,0x86, -0x86,0x87,0x87,0x88,0x88,0x89,0x89,0x89,0x8a,0x8a,0x8b,0x8b,0x8c,0x8c,0x8c,0x8d, -0x8d,0x8e,0x8e,0x8e,0x8f,0x8f,0x90,0x90,0x91,0x91,0x91,0x92,0x92,0x93,0x93,0x94, -0x94,0x94,0x95,0x95,0x96,0x96,0x96,0x97,0x97,0x98,0x98,0x99,0x99,0x99,0x9a,0x9a, -0x9b,0x9b,0x9b,0x9c,0x9c,0x9d,0x9d,0x9e,0x9e,0x9e,0x9f,0x9f,0xa0,0xa0,0xa1,0xa1, -0xa1,0xa2,0xa2,0xa3,0xa3,0xa3,0xa4,0xa4,0xa5,0xa5,0xa6,0xa6,0xa6,0xa7,0xa7,0xa8, -0xa8,0xa9,0xa9,0xa9,0xaa,0xaa,0xab,0xab,0xab,0xac,0xac,0xad,0xad,0xae,0xae,0xae, -0xaf,0xaf,0xb0,0xb0, -0xe4,0xe4,0xe4,0xe5,0xe5,0xe6,0xe6,0xe6,0xe7,0xe7,0xe8,0xe8,0xe9,0xe9,0xe9,0xea, -0xea,0xeb,0xeb,0xeb,0xec,0xec,0xed,0xed,0xee,0xee,0xee,0xef,0xef,0xf0,0xf0,0xf1, -0xf1,0xf1,0xf2,0xf2,0xf3,0xf3,0xf3,0xf4,0xf4,0xf5,0xf5,0xf6,0xf6,0xf6,0xf7,0xf7, -0xf8,0xf8,0xf9,0xf9,0xf9,0xfa,0xfa,0xfb,0xfb,0xfb,0xfc,0xfc,0xfd,0xfd,0xfe,0xfe, -0xfe,0xff,0xff,0x00,0x00,0x00,0x01,0x01,0x02,0x02,0x03,0x03,0x03,0x04,0x04,0x05, -0x05,0x06,0x06,0x06, -0x07,0x07,0x08,0x08,0x08,0x09,0x09,0x0a,0x0a,0x0b,0x0b,0x0b,0x0c,0x0c,0x0d,0x0d, -0x0e,0x0e,0x0e,0x0f,0x0f,0x10,0x10,0x10, +0xa5,0xa6,0xa7,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xae,0xaf,0xb0,0xb1,0xb2, +0xb3,0xb4,0xb5,0xb5,0xe4,0xe5,0xe6,0xe7,0xe8,0xe8,0xe9,0xea,0xea,0xeb,0xec,0xed, +0xed,0xee,0xef,0xef,0xf0,0xf1,0xf2,0xf2,0xf3,0xf4,0xf4,0xf5,0xf6,0xf7,0xf7,0xf8, +0xf9,0xfa,0xfb,0xfc,0xfd,0xfd,0xfe,0xff,0x00,0x01,0x02,0x03,0x04,0x04,0x05,0x06, +0x07,0x08,0x09,0x0a,0x0b,0x0b,0x0c,0x0d,0x0e,0x0f,0x10,0x11,0x12,0x12,0x13,0x14, +0x15,0x16,0x17,0x18,0x19,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,0x20,0x20,0x21,0x22, +0x23,0x24,0x25,0x26,0x27,0x27,0x28,0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2e,0x2f,0x30, +0x31,0x32,0x33,0x34,0x35,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3c,0x3d,0x3e, +0x3f,0x40,0x41,0x42,0x43,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4a,0x4a,0x4b,0x4c, +0x4d,0x4e,0x4f,0x50,0x51,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x58,0x59,0x5a, +0x5b,0x5c,0x5d,0x5e,0x5f,0x5f,0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x66,0x67,0x68, +0x69,0x6a,0x6b,0x6c,0x6d,0x6d,0x6e,0x6f,0x70,0x71,0x72,0x73,0x74,0x74,0x75,0x76, +0x77,0x78,0x79,0x7a,0x7b,0x7b,0x7c,0x7d,0x7e,0x7f,0x80,0x81,0x82,0x82,0x83,0x84, +0x85,0x86,0x87,0x88,0x89,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f,0x90,0x90,0x91,0x92, +0x93,0x94,0x95,0x96,0x97,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9e,0x9f,0xa0, +0xa1,0xa2,0xa3,0xa4,0xa5,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xac,0xad,0xae, }; -// H-counter table for hvcounter reads in 32col mode +// H-counter table for hvcounter reads in 32col mode, starting at HINT const unsigned char hcounts_32[] = { -0x05,0x05,0x05,0x06,0x06,0x07,0x07,0x07,0x08,0x08,0x08,0x09,0x09,0x09,0x0a,0x0a, -0x0a,0x0b,0x0b,0x0b,0x0c,0x0c,0x0c,0x0d,0x0d,0x0d,0x0e,0x0e,0x0f,0x0f,0x0f,0x10, -0x10,0x10,0x11,0x11,0x11,0x12,0x12,0x12,0x13,0x13,0x13,0x14,0x14,0x14,0x15,0x15, -0x15,0x16,0x16,0x17,0x17,0x17,0x18,0x18,0x18,0x19,0x19,0x19,0x1a,0x1a,0x1a,0x1b, -0x1b,0x1b,0x1c,0x1c,0x1c,0x1d,0x1d,0x1d,0x1e,0x1e,0x1f,0x1f,0x1f,0x20,0x20,0x20, -0x21,0x21,0x21,0x22,0x22,0x22,0x23,0x23,0x23,0x24,0x24,0x24,0x25,0x25,0x26,0x26, -0x26,0x27,0x27,0x27,0x28,0x28,0x28,0x29,0x29,0x29,0x2a,0x2a,0x2a,0x2b,0x2b,0x2b, -0x2c,0x2c,0x2c,0x2d,0x2d,0x2e,0x2e,0x2e,0x2f,0x2f,0x2f,0x30,0x30,0x30,0x31,0x31, -0x31,0x32,0x32,0x32,0x33,0x33,0x33,0x34,0x34,0x34,0x35,0x35,0x36,0x36,0x36,0x37, -0x37,0x37,0x38,0x38,0x38,0x39,0x39,0x39,0x3a,0x3a,0x3a,0x3b,0x3b,0x3b,0x3c,0x3c, -0x3d,0x3d,0x3d,0x3e,0x3e,0x3e,0x3f,0x3f,0x3f,0x40,0x40,0x40,0x41,0x41,0x41,0x42, -0x42,0x42,0x43,0x43,0x43,0x44,0x44,0x45,0x45,0x45,0x46,0x46,0x46,0x47,0x47,0x47, -0x48,0x48,0x48,0x49,0x49,0x49,0x4a,0x4a,0x4a,0x4b,0x4b,0x4b,0x4c,0x4c,0x4d,0x4d, -0x4d,0x4e,0x4e,0x4e,0x4f,0x4f,0x4f,0x50,0x50,0x50,0x51,0x51,0x51,0x52,0x52,0x52, -0x53,0x53,0x53,0x54,0x54,0x55,0x55,0x55,0x56,0x56,0x56,0x57,0x57,0x57,0x58,0x58, -0x58,0x59,0x59,0x59,0x5a,0x5a,0x5a,0x5b,0x5b,0x5c,0x5c,0x5c,0x5d,0x5d,0x5d,0x5e, -0x5e,0x5e,0x5f,0x5f,0x5f,0x60,0x60,0x60,0x61,0x61,0x61,0x62,0x62,0x62,0x63,0x63, -0x64,0x64,0x64,0x65,0x65,0x65,0x66,0x66,0x66,0x67,0x67,0x67,0x68,0x68,0x68,0x69, -0x69,0x69,0x6a,0x6a,0x6a,0x6b,0x6b,0x6c,0x6c,0x6c,0x6d,0x6d,0x6d,0x6e,0x6e,0x6e, -0x6f,0x6f,0x6f,0x70,0x70,0x70,0x71,0x71,0x71,0x72,0x72,0x72,0x73,0x73,0x74,0x74, -0x74,0x75,0x75,0x75,0x76,0x76,0x76,0x77,0x77,0x77,0x78,0x78,0x78,0x79,0x79,0x79, -0x7a,0x7a,0x7b,0x7b,0x7b,0x7c,0x7c,0x7c,0x7d,0x7d,0x7d,0x7e,0x7e,0x7e,0x7f,0x7f, -0x7f,0x80,0x80,0x80,0x81,0x81,0x81,0x82,0x82,0x83,0x83,0x83,0x84,0x84,0x84,0x85, -0x85,0x85,0x86,0x86,0x86,0x87,0x87,0x87,0x88,0x88,0x88,0x89,0x89,0x89,0x8a,0x8a, -0x8b,0x8b,0x8b,0x8c,0x8c,0x8c,0x8d,0x8d,0x8d,0x8e,0x8e,0x8e,0x8f,0x8f,0x8f,0x90, -0x90,0x90,0x91,0x91, -0xe8,0xe8,0xe8,0xe9,0xe9,0xe9,0xea,0xea,0xea,0xeb,0xeb,0xeb,0xec,0xec,0xec,0xed, -0xed,0xed,0xee,0xee,0xee,0xef,0xef,0xf0,0xf0,0xf0,0xf1,0xf1,0xf1,0xf2,0xf2,0xf2, -0xf3,0xf3,0xf3,0xf4,0xf4,0xf4,0xf5,0xf5,0xf5,0xf6,0xf6,0xf6,0xf7,0xf7,0xf8,0xf8, -0xf8,0xf9,0xf9,0xf9,0xfa,0xfa,0xfa,0xfb,0xfb,0xfb,0xfc,0xfc,0xfc,0xfd,0xfd,0xfd, -0xfe,0xfe,0xfe,0xff,0xff,0x00,0x00,0x00,0x01,0x01,0x01,0x02,0x02,0x02,0x03,0x03, -0x03,0x04,0x04,0x04, -0x05,0x05,0x05,0x06,0x06,0x07,0x07,0x07,0x08,0x08,0x08,0x09,0x09,0x09,0x0a,0x0a, -0x0a,0x0b,0x0b,0x0b,0x0c,0x0c,0x0c,0x0d, +0x85,0x86,0x86,0x87,0x88,0x88,0x89,0x8a,0x8a,0x8b,0x8c,0x8d,0x8d,0x8e,0x8f,0x8f, +0x90,0x91,0x91,0x92,0x93,0xe9,0xe9,0xea,0xeb,0xeb,0xec,0xed,0xed,0xee,0xef,0xf0, +0xf0,0xf1,0xf2,0xf2,0xf3,0xf4,0xf4,0xf5,0xf6,0xf7,0xf7,0xf8,0xf9,0xf9,0xfa,0xfb, +0xfb,0xfc,0xfd,0xfe,0xfe,0xff,0x00,0x00,0x01,0x02,0x02,0x03,0x04,0x05,0x05,0x06, +0x07,0x07,0x08,0x09,0x09,0x0a,0x0b,0x0c,0x0c,0x0d,0x0e,0x0e,0x0f,0x10,0x10,0x11, +0x12,0x13,0x13,0x14,0x15,0x15,0x16,0x17,0x17,0x18,0x19,0x1a,0x1a,0x1b,0x1c,0x1c, +0x1d,0x1e,0x1e,0x1f,0x20,0x21,0x21,0x22,0x23,0x23,0x24,0x25,0x25,0x26,0x27,0x28, +0x28,0x29,0x2a,0x2a,0x2b,0x2c,0x2c,0x2d,0x2e,0x2f,0x2f,0x30,0x31,0x31,0x32,0x33, +0x33,0x34,0x35,0x36,0x36,0x37,0x38,0x38,0x39,0x3a,0x3a,0x3b,0x3c,0x3d,0x3d,0x3e, +0x3f,0x3f,0x40,0x41,0x41,0x42,0x43,0x44,0x44,0x45,0x46,0x46,0x47,0x48,0x48,0x49, +0x4a,0x4b,0x4b,0x4c,0x4d,0x4d,0x4e,0x4f,0x4f,0x50,0x51,0x52,0x52,0x53,0x54,0x54, +0x55,0x56,0x56,0x57,0x58,0x59,0x59,0x5a,0x5b,0x5b,0x5c,0x5d,0x5d,0x5e,0x5f,0x60, +0x60,0x61,0x62,0x62,0x63,0x64,0x64,0x65,0x66,0x67,0x67,0x68,0x69,0x69,0x6a,0x6b, +0x6b,0x6c,0x6d,0x6e,0x6e,0x6f,0x70,0x70,0x71,0x72,0x72,0x73,0x74,0x75,0x75,0x76, +0x77,0x77,0x78,0x79,0x79,0x7a,0x7b,0x7c,0x7c,0x7d,0x7e,0x7e,0x7f,0x80,0x80,0x81, +0x82,0x83,0x83,0x84,0x85,0x85,0x86,0x87,0x87,0x88,0x89,0x8a,0x8a,0x8b,0x8c,0x8c, }; +// VDP transfer slots for blanked and active display in 32col and 40col mode. +// 1 slot is 488/171 = 2.8538 68k cycles in h32, and 488/210 = 2.3238 in h40 +// In blanked display, all slots but 5(h32) / 6(h40) are usable for transfers, +// in active display only 16(h32) / 18(h40) slots can be used. + +// XXX inactive tables by slot#=cycles*maxslot#/488. should be through hv tables +// VDP transfer slots in inactive (blanked) display 32col mode. +// refresh slots: 250, 26, 58, 90, 122 -> 32, 64, 96, 128, 160 +const unsigned char vdpcyc2sl_32_bl[] = { // 68k cycles/2 to slot # +// 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 + 0, 0, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 10, + 10, 11, 12, 12, 13, 14, 14, 15, 16, 17, 17, 18, 19, 19, 20, 21, + 21, 22, 23, 23, 24, 25, 25, 26, 27, 27, 28, 29, 29, 30, 31, 31, + 32, 33, 34, 34, 35, 36, 36, 37, 38, 38, 39, 40, 40, 41, 42, 42, + 43, 44, 44, 45, 46, 46, 47, 48, 48, 49, 50, 51, 51, 52, 53, 53, + 54, 55, 55, 56, 57, 57, 58, 59, 59, 60, 61, 61, 62, 63, 63, 64, + 65, 65, 66, 67, 68, 68, 69, 70, 70, 71, 72, 72, 73, 74, 74, 75, + 76, 76, 77, 78, 78, 79, 80, 80, 81, 82, 83, 83, 84, 85, 85, 86, + 87, 87, 88, 89, 89, 90, 91, 91, 92, 93, 93, 94, 95, 95, 96, 97, + 97, 98, 99,100,100,101,102,102,103,104,104,105,106,106,107,108, + 108,109,110,110,111,112,112,113,114,114,115,116,117,117,118,119, + 119,120,121,121,122,123,123,124,125,125,126,127,127,128,129,129, + 130,131,131,132,133,134,134,135,136,136,137,138,138,139,140,140, + 141,142,142,143,144,144,145,146,146,147,148,148,149,150,151,151, + 152,153,153,154,155,155,156,157,157,158,159,159,160,161,161,162, + 163,163,164,165,166,166,167,168,168,169,170,170,171,172,172,173, +}; +// VDP transfer slots in inactive (blanked) display 40col mode. +// refresh slots: 250, 26, 58, 90, 122, 154 -> 40, 72, 104, 136, 168, 200 +const unsigned char vdpcyc2sl_40_bl[] = { // 68k cycles/2 to slot # +// 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 + 0, 0, 1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10, 10, 11, 12, + 13, 14, 15, 15, 16, 17, 18, 19, 20, 20, 21, 22, 23, 24, 25, 25, + 26, 27, 28, 29, 30, 30, 31, 32, 33, 34, 35, 35, 36, 37, 38, 39, + 40, 40, 41, 42, 43, 44, 45, 45, 46, 47, 48, 49, 50, 51, 51, 52, + 53, 54, 55, 56, 56, 57, 58, 59, 60, 61, 61, 62, 63, 64, 65, 66, + 66, 67, 68, 69, 70, 71, 71, 72, 73, 74, 75, 76, 76, 77, 78, 79, + 80, 81, 81, 82, 83, 84, 85, 86, 86, 87, 88, 89, 90, 91, 91, 92, + 93, 94, 95, 96, 96, 97, 98, 99,100,101,102,102,103,104,105,106, + 107,107,108,109,110,111,112,112,113,114,115,116,117,117,118,119, + 120,121,122,122,123,124,125,126,127,127,128,129,130,131,132,132, + 133,134,135,136,137,137,138,139,140,141,142,142,143,144,145,146, + 147,147,148,149,150,151,152,153,153,154,155,156,157,158,158,159, + 160,161,162,163,163,164,165,166,167,168,168,169,170,171,172,173, + 173,174,175,176,177,178,178,179,180,181,182,183,183,184,185,186, + 187,188,188,189,190,191,192,193,193,194,195,196,197,198,198,199, + 200,201,202,203,204,204,205,206,207,208,209,209,210,211,212,213, +}; +// VDP transfer slots in active display 32col mode. Transfer slots (Hint=0): +// 11,25,40,48,56,72,80,88,104,112,120,136,144,152,167,168 +const unsigned char vdpcyc2sl_32[] = { // 68k cycles/2 to slot # +// 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, + 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, + 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, + 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, + 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, + 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, + 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, + 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, + 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, + 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, + 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, + 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, + 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, + 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, + 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, +}; +// VDP transfer slots in active display 40col mode. Transfer slots (Hint=0): +// 21,47,55,63,79,87,95,111,119,127,143,151,159,175,183,191,206,207 +const unsigned char vdpcyc2sl_40[] = { // 68k cycles/2 to slot # +// 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, // 0 + 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, // 32 + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // 64 + 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, // 96
View file
libretro-picodrive-0~git20200112.tar.xz/pico/pico.c -> libretro-picodrive-0~git20200716.tar.xz/pico/pico.c
Changed
@@ -67,6 +67,7 @@ memset(&Pico.video,0,sizeof(Pico.video)); memset(&Pico.m,0,sizeof(Pico.m)); + memset(&Pico.t,0,sizeof(Pico.t)); Pico.video.pending_ints=0; z80_reset(); @@ -78,6 +79,7 @@ Pico.video.reg[0] = Pico.video.reg[1] = 0x04; Pico.video.reg[0xc] = 0x81; Pico.video.reg[0xf] = 0x02; + PicoVideoFIFOMode(0, 1); if (PicoIn.AHW & PAHW_MCD) PicoPowerMCD(); @@ -182,8 +184,7 @@ PsndReset(); // pal must be known here // create an empty "dma" to cause 68k exec start at random frame location - if (Pico.m.dma_xfers == 0 && !(PicoIn.opt & POPT_DIS_VDP_FIFO)) - Pico.m.dma_xfers = rand() & 0x1fff; + PicoVideoFIFOWrite(rand() & 0x1fff, 0, 0, PVS_CPURD); SekFinishIdleDet(); @@ -222,51 +223,6 @@ rendstatus_old = -1; } -// this table is wrong and should be removed -// keeping it for now to compensate wrong timing elswhere, mainly for Outrunners -static const int dma_timings[] = { - 83, 166, 83, 83, // vblank: 32cell: dma2vram dma2[vs|c]ram vram_fill vram_copy - 102, 204, 102, 102, // vblank: 40cell: - 8, 16, 8, 8, // active: 32cell: - 17, 18, 9, 9 // ... -}; - -static const int dma_bsycles[] = { - (488<<8)/83, (488<<8)/166, (488<<8)/83, (488<<8)/83, - (488<<8)/102, (488<<8)/204, (488<<8)/102, (488<<8)/102, - (488<<8)/8, (488<<8)/16, (488<<8)/8, (488<<8)/8, - (488<<8)/9, (488<<8)/18, (488<<8)/9, (488<<8)/9 -}; - -// grossly inaccurate.. FIXME FIXXXMEE -PICO_INTERNAL int CheckDMA(void) -{ - int burn = 0, xfers_can, dma_op = Pico.video.reg[0x17]>>6; // see gens for 00 and 01 modes - int xfers = Pico.m.dma_xfers; - int dma_op1; - - if(!(dma_op&2)) dma_op = (Pico.video.type==1) ? 0 : 1; // setting dma_timings offset here according to Gens - dma_op1 = dma_op; - if(Pico.video.reg[12] & 1) dma_op |= 4; // 40 cell mode? - if(!(Pico.video.status&8)&&(Pico.video.reg[1]&0x40)) dma_op|=8; // active display? - xfers_can = dma_timings[dma_op]; - if(xfers <= xfers_can) - { - Pico.video.status &= ~SR_DMA; - if (!(dma_op & 2)) - burn = xfers * dma_bsycles[dma_op] >> 8; // have to be approximate because can't afford division.. - Pico.m.dma_xfers = 0; - } else { - if(!(dma_op&2)) burn = 488; - Pico.m.dma_xfers -= xfers_can; - } - - elprintf(EL_VDPDMA, "~Dma %i op=%i can=%i burn=%i [%u]", - Pico.m.dma_xfers, dma_op1, xfers_can, burn, SekCyclesDone()); - //dprintf("~aim: %i, cnt: %i", Pico.t.m68c_aim, Pico.t.m68c_cnt); - return burn; -} - #include "pico_cmn.c" /* sync z80 to 68k */ @@ -313,7 +269,7 @@ goto end; } - //if(Pico.video.reg[12]&0x2) Pico.video.status ^= 0x10; // change odd bit in interlace mode + //if(Pico.video.reg[12]&0x2) Pico.video.status ^= SR_ODD; // change odd bit in interlace mode PicoFrameStart(); PicoFrameHints(); @@ -326,7 +282,7 @@ { if (!(PicoIn.AHW & PAHW_SMS)) { PicoFrameStart(); - PicoDrawSync(223, 0); + PicoDrawSync(Pico.m.pal?239:223, 0); } else { PicoFrameDrawOnlyMS(); }
View file
libretro-picodrive-0~git20200112.tar.xz/pico/pico.h -> libretro-picodrive-0~git20200716.tar.xz/pico/pico.h
Changed
@@ -70,8 +70,10 @@ #define POPT_EN_DRC (1<<17) #define POPT_DIS_SPRITE_LIM (1<<18) #define POPT_DIS_IDLE_DET (1<<19) -#define POPT_EN_32X (1<<20) +#define POPT_EN_32X (1<<20) // x0 0000 #define POPT_EN_PWM (1<<21) +#define POPT_PWM_IRQ_OPT (1<<22) +#define POPT_DIS_FM_SSGEG (1<<23) #define PAHW_MCD (1<<0) #define PAHW_32X (1<<1) @@ -197,14 +199,16 @@ #endif void PicoDoHighPal555(int sh, int line, struct PicoEState *est); // internals -#define PDRAW_SPRITES_MOVED (1<<0) // (asm) +#define PDRAW_SPRITES_MOVED (1<<0) // SAT address modified #define PDRAW_WND_DIFF_PRIO (1<<1) // not all window tiles use same priority +#define PDRAW_PARSE_SPRITES (1<<2) // SAT needs parsing #define PDRAW_INTERLACE (1<<3) -#define PDRAW_DIRTY_SPRITES (1<<4) // (asm) +#define PDRAW_DIRTY_SPRITES (1<<4) // SAT modified #define PDRAW_SONIC_MODE (1<<5) // mid-frame palette changes for 8bit renderer #define PDRAW_PLANE_HI_PRIO (1<<6) // have layer with all hi prio tiles (mk3) #define PDRAW_SHHI_DONE (1<<7) // layer sh/hi already processed #define PDRAW_32_COLS (1<<8) // 32 column mode +#define PDRAW_BORDER_32 (1<<9) // center H32 in buffer (32 px border) extern int rendstatus_old; extern int rendlines;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/pico_cmn.c -> libretro-picodrive-0~git20200716.tar.xz/pico/pico_cmn.c
Changed
@@ -22,27 +22,30 @@ #endif // sync m68k to Pico.t.m68c_aim -static void SekSyncM68k(void) +static void SekExecM68k(int cyc_do) { - int cyc_do; - pprof_start(m68k); - pevt_log_m68k_o(EVT_RUN_START); - - while ((cyc_do = Pico.t.m68c_aim - Pico.t.m68c_cnt) > 0) { - Pico.t.m68c_cnt += cyc_do; + Pico.t.m68c_cnt += cyc_do; #if defined(EMU_C68K) - PicoCpuCM68k.cycles = cyc_do; - CycloneRun(&PicoCpuCM68k); - Pico.t.m68c_cnt -= PicoCpuCM68k.cycles; + PicoCpuCM68k.cycles = cyc_do; + CycloneRun(&PicoCpuCM68k); + Pico.t.m68c_cnt -= PicoCpuCM68k.cycles; #elif defined(EMU_M68K) - Pico.t.m68c_cnt += m68k_execute(cyc_do) - cyc_do; + Pico.t.m68c_cnt += m68k_execute(cyc_do) - cyc_do; #elif defined(EMU_F68K) - Pico.t.m68c_cnt += fm68k_emulate(&PicoCpuFM68k, cyc_do, 0) - cyc_do; + Pico.t.m68c_cnt += fm68k_emulate(&PicoCpuFM68k, cyc_do, 0) - cyc_do; #endif - } - SekCyclesLeft = 0; +} + +static void SekSyncM68k(void) +{ + int cyc_do; + pprof_start(m68k); + pevt_log_m68k_o(EVT_RUN_START); + + while ((cyc_do = Pico.t.m68c_aim - Pico.t.m68c_cnt) > 0) + SekExecM68k(cyc_do); SekTrace(0); pevt_log_m68k_o(EVT_RUN_END); @@ -52,10 +55,10 @@ static __inline void SekRunM68k(int cyc) { Pico.t.m68c_aim += cyc; + Pico.t.m68c_cnt += cyc >> 6; // refresh slowdowns cyc = Pico.t.m68c_aim - Pico.t.m68c_cnt; if (cyc <= 0) return; - Pico.t.m68c_cnt += cyc >> 6; // refresh slowdowns SekSyncM68k(); } @@ -68,28 +71,19 @@ } } -static void do_timing_hacks_as(struct PicoVideo *pv, int vdp_slots) +static void do_timing_hacks_end(struct PicoVideo *pv) { - pv->lwrite_cnt += vdp_slots - Pico.m.dma_xfers * 2; // wrong *2 - if (pv->lwrite_cnt > vdp_slots) - pv->lwrite_cnt = vdp_slots; - else if (pv->lwrite_cnt < 0) - pv->lwrite_cnt = 0; - if (Pico.m.dma_xfers) - SekCyclesBurn(CheckDMA()); + PicoVideoFIFOSync(488); } -static void do_timing_hacks_vb(void) +static void do_timing_hacks_start(struct PicoVideo *pv) { - if (unlikely(Pico.m.dma_xfers)) - SekCyclesBurn(CheckDMA()); + SekCyclesBurn(PicoVideoFIFOHint()); // prolong cpu HOLD if necessary } static int PicoFrameHints(void) { struct PicoVideo *pv = &Pico.video; - int line_sample = Pico.m.pal ? 68 : 93; - int vdp_slots = (Pico.video.reg[12] & 1) ? 18 : 16; int lines, y, lines_vis, skip; int vcnt_wrap, vcnt_adj; unsigned int cycles; @@ -150,27 +144,11 @@ } } - // get samples from sound chips - if ((y == 224 || y == line_sample) && PicoIn.sndOut) - { - cycles = SekCyclesDone(); - - if (Pico.m.z80Run && !Pico.m.z80_reset && (PicoIn.opt&POPT_EN_Z80)) - PicoSyncZ80(cycles); -#ifdef PICO_CD - if (PicoIn.AHW & PAHW_MCD) - pcd_sync_s68k(cycles, 0); -#endif -#ifdef PICO_32X - p32x_sync_sh2s(cycles); -#endif - PsndGetSamples(y); - } - // Run scanline: Pico.t.m68c_line_start = Pico.t.m68c_aim; - do_timing_hacks_as(pv, vdp_slots); + do_timing_hacks_start(pv); CPUS_RUN(CYCLES_M68K_LINE); + do_timing_hacks_end(pv); if (PicoLineHook) PicoLineHook(); pevt_log_m68k_o(EVT_NEXT_LINE); @@ -189,10 +167,6 @@ #endif } - // VDP FIFO - pv->lwrite_cnt = 0; - Pico.video.status |= SR_EMPT; - memcpy(PicoIn.padInt, PicoIn.pad, sizeof(PicoIn.padInt)); PAD_DELAY(); @@ -204,25 +178,26 @@ } pv->status |= SR_VB | PVS_VB2; // go into vblank + PicoVideoFIFOMode(pv->reg[1]&0x40, pv->reg[12]&1); // the following SekRun is there for several reasons: // there must be a delay after vblank bit is set and irq is asserted (Mazin Saga) // also delay between F bit (bit 7) is set in SR and IRQ happens (Ex-Mutants) // also delay between last H-int and V-int (Golden Axe 3) Pico.t.m68c_line_start = Pico.t.m68c_aim; - do_timing_hacks_vb(); + do_timing_hacks_start(pv); CPUS_RUN(CYCLES_M68K_VINT_LAG); pv->status |= SR_F; pv->pending_ints |= 0x20; if (pv->reg[1] & 0x20) { - Pico.t.m68c_aim = Pico.t.m68c_cnt + 11; // HACK - SekSyncM68k(); + if (Pico.t.m68c_cnt - Pico.t.m68c_aim < 60) // CPU blocked? + SekExecM68k(11); // HACK elprintf(EL_INTS, "vint: @ %06x [%u]", SekPc, SekCyclesDone()); SekInterrupt(6); } - cycles = SekCyclesDone(); + cycles = Pico.t.m68c_aim; if (Pico.m.z80Run && !Pico.m.z80_reset && (PicoIn.opt&POPT_EN_Z80)) { PicoSyncZ80(cycles); elprintf(EL_INTS, "zint"); @@ -238,12 +213,9 @@ p32x_start_blank(); #endif - // get samples from sound chips - if (y == 224 && PicoIn.sndOut) - PsndGetSamples(y); - // Run scanline: CPUS_RUN(CYCLES_M68K_LINE - CYCLES_M68K_VINT_LAG); + do_timing_hacks_end(pv); if (PicoLineHook) PicoLineHook(); pevt_log_m68k_o(EVT_NEXT_LINE); @@ -278,8 +250,9 @@ // Run scanline: Pico.t.m68c_line_start = Pico.t.m68c_aim; - do_timing_hacks_vb(); + do_timing_hacks_start(pv); CPUS_RUN(CYCLES_M68K_LINE); + do_timing_hacks_end(pv); if (PicoLineHook) PicoLineHook(); pevt_log_m68k_o(EVT_NEXT_LINE); @@ -289,18 +262,19 @@ unsigned int l = PicoIn.overclockM68k * lines / 100; while (l-- > 0) { Pico.t.m68c_cnt -= CYCLES_M68K_LINE; - do_timing_hacks_vb(); + do_timing_hacks_start(pv); SekSyncM68k(); + do_timing_hacks_end(pv); } } pv->status &= ~(SR_VB | PVS_VB2); pv->status |= ((pv->reg[1] >> 3) ^ SR_VB) & SR_VB; // forced blanking
View file
libretro-picodrive-0~git20200112.tar.xz/pico/pico_int.h -> libretro-picodrive-0~git20200716.tar.xz/pico/pico_int.h
Changed
@@ -9,7 +9,6 @@ #ifndef PICO_INTERNAL_INCLUDED #define PICO_INTERNAL_INCLUDED - #include <stdio.h> #include <string.h> #include "pico_port.h" @@ -32,6 +31,7 @@ extern "C" { #endif +#include "pico_types.h" // ----------------------- 68000 CPU ----------------------- #ifdef EMU_C68K @@ -129,9 +129,7 @@ // burn cycles while not in SekRun() and while in #define SekCyclesBurn(c) Pico.t.m68c_cnt += c -#define SekCyclesBurnRun(c) { \ - SekCyclesLeft -= c; \ -} +#define SekCyclesBurnRun(c) SekCyclesLeft -= c // note: sometimes may extend timeslice to delay an irq #define SekEndRun(after) { \ @@ -185,7 +183,7 @@ #define z80_int_assert(a) Cz80_Set_IRQ(&CZ80, 0, (a) ? ASSERT_LINE : CLEAR_LINE) #define z80_nmi() Cz80_Set_IRQ(&CZ80, IRQ_LINE_NMI, 0) -#define z80_cyclesLeft (CZ80.ICount - CZ80.ExtraCycles) +#define z80_cyclesLeft CZ80.ICount #define z80_subCLeft(c) CZ80.ICount -= c #define z80_pc() Cz80_Get_Reg(&CZ80, CZ80_PC) @@ -207,7 +205,7 @@ #define z80_cyclesDone() \ (Pico.t.z80c_aim - z80_cyclesLeft) -#define cycles_68k_to_z80(x) ((x) * 3823 >> 13) +#define cycles_68k_to_z80(x) ((x) * 3822 >> 13) // ----------------------- SH2 CPU ----------------------- @@ -229,11 +227,10 @@ # define sh2_pc(sh2) (sh2)->ppc #else # define sh2_end_run(sh2, after_) do { \ - int left_ = (signed int)(sh2)->sr >> 12; \ - if (left_ > (after_)) { \ - (sh2)->cycles_timeslice -= left_ - (after_); \ - (sh2)->sr &= 0xfff; \ - (sh2)->sr |= (after_) << 12; \ + int left_ = ((signed int)(sh2)->sr >> 12) - (after_); \ + if (left_ > 0) { \ + (sh2)->cycles_timeslice -= left_; \ + (sh2)->sr -= (left_ << 12); \ } \ } while (0) # define sh2_cycles_left(sh2) ((signed int)(sh2)->sr >> 12) @@ -241,11 +238,11 @@ # define sh2_pc(sh2) (sh2)->pc #endif -#define sh2_cycles_done(sh2) ((int)(sh2)->cycles_timeslice - sh2_cycles_left(sh2)) +#define sh2_cycles_done(sh2) (unsigned)((int)(sh2)->cycles_timeslice - sh2_cycles_left(sh2)) #define sh2_cycles_done_t(sh2) \ - ((sh2)->m68krcycles_done * 3 + sh2_cycles_done(sh2)) + (unsigned)(C_M68K_TO_SH2(sh2, (sh2)->m68krcycles_done) + sh2_cycles_done(sh2)) #define sh2_cycles_done_m68k(sh2) \ - ((sh2)->m68krcycles_done + (sh2_cycles_done(sh2) / 3)) + (unsigned)((sh2)->m68krcycles_done + C_SH2_TO_M68K(sh2, sh2_cycles_done(sh2))) #define sh2_reg(c, x) (c) ? ssh2.r[x] : msh2.r[x] #define sh2_gbr(c) (c) ? ssh2.gbr : msh2.gbr @@ -290,6 +287,11 @@ // not part of real SR #define PVS_ACTIVE (1 << 16) #define PVS_VB2 (1 << 17) // ignores forced blanking +#define PVS_CPUWR (1 << 18) // CPU write blocked by FIFO full +#define PVS_CPURD (1 << 19) // CPU read blocked by FIFO not empty +#define PVS_DMAFILL (1 << 20) // DMA fill is waiting for fill data +#define PVS_DMABG (1 << 21) // background DMA operation is running +#define PVS_FIFORUN (1 << 22) // FIFO is processing struct PicoVideo { @@ -300,13 +302,16 @@ unsigned short addr; // Read/Write address unsigned int status; // Status bits (SR) and extra flags unsigned char pending_ints; // pending interrupts: ??VH???? - signed char lwrite_cnt; // VDP write count during active display line + signed char pad1; // was VDP write count unsigned short v_counter; // V-counter unsigned short debug; // raw debug register unsigned char debug_p; // ... parsed: PVD_* unsigned char addr_u; // bit16 of .addr unsigned char hint_cnt; - unsigned char pad[0x0b]; + unsigned char pad2; + unsigned short hv_latch; // latched hvcounter value + signed int fifo_cnt; // pending xfers for current FIFO queue entry + unsigned char pad[0x04]; }; struct PicoMisc @@ -328,8 +333,8 @@ unsigned char eeprom_cycle; // EEPROM cycle number unsigned char eeprom_slave; // EEPROM slave word for X24C02 and better SRAMs unsigned char eeprom_status; - unsigned char status; // rapid_ym2612, multi_ym_updates - unsigned short dma_xfers; // 18 + unsigned char pad1; // was ym2612 status + unsigned short dma_xfers; // 18 unused (was VDP DMA transfer count) unsigned char eeprom_wb[2]; // EEPROM latch/write buffer unsigned int frame_count; // 1c for movies and idle det }; @@ -356,6 +361,8 @@ unsigned int *PicoOpt; unsigned char *Draw2FB; unsigned short HighPal[0x100]; + unsigned short SonicPal[0x100]; + int SonicPalCount; }; struct PicoMem @@ -421,11 +428,15 @@ short len_use; // adjusted int len_e_add; // for non-int samples/frame int len_e_cnt; - short dac_line; - short psg_line; + unsigned int clkl_mult; // z80 clocks per line in Q20 + unsigned int smpl_mult; // samples per line in Q16 + short dac_val, dac_val2; // last DAC sample + unsigned int dac_pos; // last DAC position in Q20 + unsigned int fm_pos; // last FM position in Q20 + unsigned int psg_pos; // last PSG position in Q16 }; -// run tools/mkoffsets pico/pico_int_o32.h if you change these +// run tools/mkoffsets pico/pico_int_offs.h if you change these // careful with savestate compat struct Pico { @@ -597,7 +608,8 @@ { unsigned char sdram[0x40000]; #ifdef DRC_SH2 - unsigned short drcblk_ram[1 << (18 - SH2_DRCBLK_RAM_SHIFT)]; + unsigned char drcblk_ram[1 << (18 - SH2_DRCBLK_RAM_SHIFT)]; + unsigned char drclit_ram[1 << (18 - SH2_DRCBLK_RAM_SHIFT)]; #endif unsigned short dram[2][0x20000/2]; // AKA fb union { @@ -605,7 +617,8 @@ unsigned char m68k_rom_bank[0x10000]; // M68K_BANK_SIZE }; #ifdef DRC_SH2 - unsigned short drcblk_da[2][1 << (12 - SH2_DRCBLK_DA_SHIFT)]; + unsigned char drcblk_da[2][1 << (12 - SH2_DRCBLK_DA_SHIFT)]; + unsigned char drclit_da[2][1 << (12 - SH2_DRCBLK_DA_SHIFT)]; #endif union { unsigned char b[0x800]; @@ -618,8 +631,8 @@ unsigned short pal[0x100]; unsigned short pal_native[0x100]; // converted to native (for renderer) signed short pwm[2*PWM_BUFF_LEN]; // PWM buffer for current frame - signed short pwm_current[2]; // current converted samples unsigned short pwm_fifo[2][4]; // [0] - current raw, others - fifo entries + unsigned pwm_index[2]; // ringbuffer index for pwm_fifo }; // area.c @@ -648,12 +661,14 @@ void PicoDrawSync(int to, int blank_last_line); void BackFill(int reg7, int sh, struct PicoEState *est); void FinalizeLine555(int sh, int line, struct PicoEState *est); +void PicoDrawSetOutBufMD(void *dest, int increment); extern int (*PicoScanBegin)(unsigned int num); extern int (*PicoScanEnd)(unsigned int num); -#define MAX_LINE_SPRITES 29 -extern unsigned char HighLnSpr[240][3 + MAX_LINE_SPRITES]; +#define MAX_LINE_SPRITES 27 // +1 last sprite width, +4 hdr; total 32 +extern unsigned char HighLnSpr[240][4+MAX_LINE_SPRITES+1]; extern void *DrawLineDestBase; extern int DrawLineDestIncrement; +extern unsigned int VdpSATCache[128]; // draw2.c void PicoDraw2Init(void); @@ -723,7 +738,7 @@ extern struct PicoMem PicoMem; extern void (*PicoResetHook)(void); extern void (*PicoLineHook)(void); -PICO_INTERNAL int CheckDMA(void); +PICO_INTERNAL int CheckDMA(int cycles); PICO_INTERNAL void PicoDetectRegion(void);
View file
libretro-picodrive-0~git20200112.tar.xz/pico/pico_port.h -> libretro-picodrive-0~git20200716.tar.xz/pico/pico_port.h
Changed
@@ -17,10 +17,12 @@ #define NOINLINE __attribute__((noinline)) #define ALIGNED(n) __attribute__((aligned(n))) #define unlikely(x) __builtin_expect((x), 0) +#define likely(x) __builtin_expect(!!(x), 1) #else #define NOINLINE #define ALIGNED(n) #define unlikely(x) (x) +#define likely(x) (x) #endif #ifdef _MSC_VER
View file
libretro-picodrive-0~git20200716.tar.xz/pico/pico_types.h
Added
@@ -0,0 +1,19 @@ +#ifndef PICO_TYPES +#define PICO_TYPES + +#include <stdint.h> + +#ifndef __TAMTYPES_H__ +#ifndef UTYPES_DEFINED +typedef uint8_t u8; +typedef int8_t s8; +typedef uint16_t u16; +typedef int16_t s16; +typedef uint32_t u32; +typedef int32_t s32; +#endif +#endif + +typedef uintptr_t uptr; /* unsigned pointer-sized int */ + +#endif
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sms.c -> libretro-picodrive-0~git20200716.tar.xz/pico/sms.c
Changed
@@ -46,8 +46,8 @@ struct PicoVideo *pv = &Pico.video; if (pv->type == 3) { + if (PicoMem.cram[pv->addr & 0x1f] != d) Pico.m.dirtyPal = 1; PicoMem.cram[pv->addr & 0x1f] = d; - Pico.m.dirtyPal = 1; } else { PicoMem.vramb[pv->addr] = d; } @@ -152,7 +152,7 @@ case 0x40: case 0x41: - if ((d & 0x90) == 0x90 && Pico.snd.psg_line < Pico.m.scanline) + if ((d & 0x90) == 0x90) PsndDoPSG(Pico.m.scanline); SN76496Write(d); break; @@ -320,16 +320,12 @@ } } - // 224 because of how it's done for MD... - if (y == 224 && PicoIn.sndOut) - PsndGetSamplesMS(); - cycles_aim += cycles_line; cycles_done += z80_run((cycles_aim - cycles_done) >> 8) << 8; } - if (PicoIn.sndOut && Pico.snd.psg_line < lines) - PsndDoPSG(lines - 1); + if (PicoIn.sndOut) + PsndGetSamplesMS(lines); } void PicoFrameDrawOnlyMS(void)
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/mix.c -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/mix.c
Changed
@@ -6,32 +6,72 @@ * See COPYING file in the top-level directory. */ +#include "string.h" + #define MAXOUT (+32767) #define MINOUT (-32768) /* limitter */ -#define Limit(val, max,min) { \ - if ( val > max ) val = max; \ - else if ( val < min ) val = min; \ -} +#define Limit16(val) \ + if ((short)val != val) val = (val < 0 ? MINOUT : MAXOUT) +int mix_32_to_16l_level; -void mix_32_to_16l_stereo(short *dest, int *src, int count) +static struct iir2 { // 2-pole IIR + int x[2]; // sample buffer + int y[2]; // filter intermediates + int i; +} lfi2, rfi2; + +// NB ">>" rounds to -infinity, "/" to 0. To compensate the effect possibly use +// "-(-y>>n)" (round to +infinity) instead of "y>>n" in places. + +// NB uses Q12 fixpoint; samples mustn't have more than 20 bits for this. +#define QB 12 + + +// exponential moving average filter for DC filtering +// y[n] = (x[n]-y[n-1])*(1/8192) (corner approx. 20Hz, gain 1) +static inline int filter_exp(struct iir2 *fi2, int x) { - int l, r; + int xf = (x<<QB) - fi2->y[0]; + fi2->y[0] += xf >> 13; + xf -= xf >> 2; // level reduction to avoid clipping from overshoot + return xf>>QB; +} - for (; count > 0; count--) - { - l = r = *dest; - l += *src++; - r += *src++; - Limit( l, MAXOUT, MINOUT ); - Limit( r, MAXOUT, MINOUT ); - *dest++ = l; - *dest++ = r; - } +// unfiltered (for testing) +static inline int filter_null(struct iir2 *fi2, int x) +{ + return x; +} + +#define mix_32_to_16l_stereo_core(dest, src, count, lv, fl) { \ + int l, r; \ + \ + for (; count > 0; count--) \ + { \ + l = r = *dest; \ + l += *src++ >> lv; \ + r += *src++ >> lv; \ + l = fl(&lfi2, l); \ + r = fl(&rfi2, r); \ + Limit16(l); \ + Limit16(r); \ + *dest++ = l; \ + *dest++ = r; \ + } \ +} + +void mix_32_to_16l_stereo_lvl(short *dest, int *src, int count) +{ + mix_32_to_16l_stereo_core(dest, src, count, mix_32_to_16l_level, filter_exp); } +void mix_32_to_16l_stereo(short *dest, int *src, int count) +{ + mix_32_to_16l_stereo_core(dest, src, count, 0, filter_exp); +} void mix_32_to_16_mono(short *dest, int *src, int count) { @@ -41,7 +81,8 @@ { l = *dest; l += *src++; - Limit( l, MAXOUT, MINOUT ); + l = filter_exp(&lfi2, l); + Limit16(l); *dest++ = l; } } @@ -77,3 +118,8 @@ } } +void mix_reset(void) +{ + memset(&lfi2, 0, sizeof(lfi2)); + memset(&rfi2, 0, sizeof(rfi2)); +}
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/mix.h -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/mix.h
Changed
@@ -8,3 +8,4 @@ extern int mix_32_to_16l_level; void mix_32_to_16l_stereo_lvl(short *dest, int *src, int count); +void mix_reset(void);
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/mix_arm.S -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/mix_arm.S
Changed
@@ -166,13 +166,6 @@ @ limit and shift up by 16 @ reg=int_sample, lr=1, r3=tmp, kills flags .macro Limitsh reg -@ movs r4, r3, asr #16 -@ cmnne r4, #1 -@ beq c32_16_no_overflow -@ tst r4, r4 -@ mov r3, #0x8000 -@ subpl r3, r3, #1 - add r3, lr, \reg, asr #15 bics r3, r3, #1 @ in non-overflow conditions r3 is 0 or 1 moveq \reg, \reg, lsl #16 @@ -180,20 +173,30 @@ subpl \reg, \reg, #0x00010000 .endm +@ filter out DC offset +@ in=int_sample (max 20 bit), y=filter memory, r3=tmp +.macro DCfilt in y + rsb r3, \y, \in, lsl #12 @ fixpoint 20.12 + add \y, \y, r3, asr #13 + sub r3, r3, r3, asr #2 @ reduce audio lvl some + asr \in, r3, #12 +.endm @ mix 32bit audio (with 16bits really used, upper bits indicate overflow) with normal 16 bit audio with left channel only @ warning: this function assumes dest is word aligned .global mix_32_to_16l_stereo @ short *dest, int *src, int count mix_32_to_16l_stereo: - stmfd sp!, {r4-r8,lr} - - mov lr, #1 + stmfd sp!, {r4-r8,r10-r11,lr} mov r2, r2, lsl #1 subs r2, r2, #4 bmi m32_16l_st_end + mov lr, #1 + ldr r12, =filter + ldmia r12, {r10-r11} + m32_16l_st_loop: ldmia r0, {r8,r12} ldmia r1!, {r4-r7} @@ -203,6 +206,10 @@ add r5, r5, r8, asr #16 add r6, r6, r12,asr #16 add r7, r7, r12,asr #16 + DCfilt r4, r10 + DCfilt r5, r11 + DCfilt r6, r10 + DCfilt r7, r11 Limitsh r4 Limitsh r5 Limitsh r6 @@ -221,13 +228,17 @@ ldmia r1!,{r4,r5} add r4, r4, r6 add r5, r5, r6 + DCfilt r4, r10 + DCfilt r5, r11 Limitsh r4 Limitsh r5 orr r4, r5, r4, lsr #16 str r4, [r0], #4 m32_16l_st_no_unal2: - ldmfd sp!, {r4-r8,lr} + ldr r12, =filter + stmia r12, {r10-r11} + ldmfd sp!, {r4-r8,r10-r11,lr} bx lr @@ -235,9 +246,11 @@ .global mix_32_to_16_mono @ short *dest, int *src, int count mix_32_to_16_mono: - stmfd sp!, {r4-r8,lr} + stmfd sp!, {r4-r8,r10-r11,lr} mov lr, #1 + ldr r12, =filter + ldr r10, [r12] @ check if dest is word aligned tst r0, #2 @@ -262,6 +275,10 @@ add r7, r7, r12,asr #16 mov r12,r12,lsl #16 add r6, r6, r12,asr #16 + DCfilt r4, r10 + DCfilt r5, r10 + DCfilt r6, r10 + DCfilt r7, r10 Limitsh r4 Limitsh r5 Limitsh r6 @@ -281,6 +298,8 @@ add r5, r5, r6, asr #16 mov r6, r6, lsl #16 add r4, r4, r6, asr #16 + DCfilt r4, r10 + DCfilt r5, r10 Limitsh r4 Limitsh r5 orr r4, r5, r4, lsr #16 @@ -288,14 +307,18 @@ m32_16_mo_no_unal2: tst r2, #1 - ldmeqfd sp!, {r4-r8,pc} + beq m32_16_mo_no_unal ldrsh r5, [r0] ldr r4, [r1], #4 add r4, r4, r5 + DCfilt r4, r10 Limit r4 strh r4, [r0], #2 - ldmfd sp!, {r4-r8,lr} +m32_16_mo_no_unal: + ldr r12, =filter + str r10, [r12] + ldmfd sp!, {r4-r8,r10-r11,lr} bx lr @@ -315,11 +338,13 @@ .global mix_32_to_16l_stereo_lvl @ short *dest, int *src, int count mix_32_to_16l_stereo_lvl: - stmfd sp!, {r4-r9,lr} + stmfd sp!, {r4-r11,lr} ldr r9, =mix_32_to_16l_level mov lr, #1 ldr r9, [r9] + ldr r12, =filter + ldm r12, {r10-r11} mov r2, r2, lsl #1 subs r2, r2, #4 @@ -338,6 +363,10 @@ mov r5, r5, asr r9 mov r6, r6, asr r9 mov r7, r7, asr r9 + DCfilt r4, r10 + DCfilt r5, r11 + DCfilt r6, r10 + DCfilt r7, r11 Limitsh r4 Limitsh r5 Limitsh r6 @@ -358,15 +387,31 @@ add r5, r5, r6 mov r4, r4, asr r9 mov r5, r5, asr r9 + DCfilt r4, r10 + DCfilt r5, r11 Limitsh r4 Limitsh r5 orr r4, r5, r4, lsr #16 str r4, [r0], #4 m32_16l_st_l_no_unal2: - ldmfd sp!, {r4-r9,lr} + ldr r12, =filter + stmia r12, {r10-r11} + ldmfd sp!, {r4-r11,lr} bx lr #endif /* __GP2X__ */ +.global mix_reset @ void +mix_reset: + ldr r0, =filter + mov r1, #0 + str r1, [r0] + str r1, [r0, #4] + bx lr + +.data +filter: + .ds 8 + @ vim:filetype=armasm
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/sn76496.c -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/sn76496.c
Changed
@@ -173,9 +173,12 @@ /* If we exit the loop in the middle, Output[i] has to be inverted */ /* and vol[i] incremented only if the exit status of the square */ /* wave is 1. */ + left = 0; while (R->Count[i] <= 0) { - R->Count[i] += R->Period[i]; + if (R->Count[i] + R->Period[i]*4 < R->Period[i]) + left+= 4, R->Count[i] += R->Period[i]*4; + else left++, R->Count[i] += R->Period[i]; if (R->Count[i] > 0) { R->Output[i] ^= 1; @@ -186,6 +189,9 @@ vol[i] += R->Period[i]; } if (R->Output[i]) vol[i] -= R->Count[i]; + /* Cut of anything above the sample freqency. It will only create */ + /* aliasing and hearable distortions anyway. */ + if (left > 1) vol[i] = STEP/2; } left = STEP;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/sound.c -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/sound.c
Changed
@@ -19,9 +19,6 @@ // master int buffer to mix to static int PsndBuffer[2*(44100+100)/50]; -// dac, psg -static unsigned short dac_info[312+4]; // pos in sample buffer - // cdda output buffer short cdda_out_buffer[2*1152]; @@ -95,58 +92,6 @@ void (*low_pass_filter)(int *buf32, int length) = low_pass_filter_stereo; -static void dac_recalculate(void) -{ - int lines = Pico.m.pal ? 313 : 262; - int mid = Pico.m.pal ? 68 : 93; - int i, dac_cnt, pos, len; - - if (Pico.snd.len <= lines) - { - // shrinking algo - dac_cnt = -Pico.snd.len; - len=1; pos=0; - dac_info[225] = 1; - - for(i=226; i != 225; i++) - { - if (i >= lines) i = 0; - if(dac_cnt < 0) { - pos++; - dac_cnt += lines; - } - dac_cnt -= Pico.snd.len; - dac_info[i] = pos; - } - } - else - { - // stretching - dac_cnt = Pico.snd.len; - pos=0; - for(i = 225; i != 224; i++) - { - if (i >= lines) i = 0; - len=0; - while(dac_cnt >= 0) { - dac_cnt -= lines; - len++; - } - if (i == mid) // midpoint - while(pos+len < Pico.snd.len/2) { - dac_cnt -= lines; - len++; - } - dac_cnt += Pico.snd.len; - pos += len; - dac_info[i] = pos; - } - } - for (i = lines; i < sizeof(dac_info) / sizeof(dac_info[0]); i++) - dac_info[i] = dac_info[0]; -} - - PICO_INTERNAL void PsndReset(void) { // PsndRerate calls YM2612Init, which also resets @@ -156,6 +101,8 @@ // Reset low pass filter lpf_lp = 0; lpf_rp = 0; + + mix_reset(); } @@ -164,6 +111,7 @@ { void *state = NULL; int target_fps = Pico.m.pal ? 50 : 60; + int target_lines = Pico.m.pal ? 313 : 262; if (preserve_state) { state = malloc(0x204); @@ -171,7 +119,7 @@ ym2612_pack_state(); memcpy(state, YM2612GetRegs(), 0x204); } - YM2612Init(Pico.m.pal ? OSC_PAL/7 : OSC_NTSC/7, PicoIn.sndRate); + YM2612Init(Pico.m.pal ? OSC_PAL/7 : OSC_NTSC/7, PicoIn.sndRate, !(PicoIn.opt&POPT_DIS_FM_SSGEG)); if (preserve_state) { // feed it back it's own registers, just like after loading state memcpy(YM2612GetRegs(), state, 0x204); @@ -188,10 +136,12 @@ // calculate Pico.snd.len Pico.snd.len = PicoIn.sndRate / target_fps; Pico.snd.len_e_add = ((PicoIn.sndRate - Pico.snd.len * target_fps) << 16) / target_fps; - Pico.snd.len_e_cnt = 0; + Pico.snd.len_e_cnt = 0; // Q16 - // recalculate dac info - dac_recalculate(); + // samples per line (Q16) + Pico.snd.smpl_mult = 65536LL * PicoIn.sndRate / (target_fps*target_lines); + // samples per z80 clock (Q20) + Pico.snd.clkl_mult = 16 * Pico.snd.smpl_mult * 15/7 / 488; // clear all buffers memset32(PsndBuffer, 0, sizeof(PsndBuffer)/4); @@ -219,59 +169,61 @@ Pico.snd.len_e_cnt -= 0x10000; Pico.snd.len_use++; } - - Pico.snd.dac_line = Pico.snd.psg_line = 0; - Pico.m.status &= ~1; - dac_info[224] = Pico.snd.len_use; } -PICO_INTERNAL void PsndDoDAC(int line_to) +PICO_INTERNAL void PsndDoDAC(int cyc_to) { - int pos, pos1, len; + int pos, len; int dout = ym2612.dacout; - int line_from = Pico.snd.dac_line; - if (line_to >= 313) - line_to = 312; + // number of samples to fill in buffer (Q20) + len = (cyc_to * Pico.snd.clkl_mult) - Pico.snd.dac_pos; - pos = dac_info[line_from]; - pos1 = dac_info[line_to + 1]; - len = pos1 - pos; + // update position and calculate buffer offset and length + pos = (Pico.snd.dac_pos+0x80000) >> 20; + Pico.snd.dac_pos += len; + len = ((Pico.snd.dac_pos+0x80000) >> 20) - pos; + + // avoid loss of the 1st sample of a new block (Q rounding issues) + if (pos+len == 0) + len = 1, Pico.snd.dac_pos += 0x80000; if (len <= 0) return; - Pico.snd.dac_line = line_to + 1; - if (!PicoIn.sndOut) return; + // fill buffer, applying a rather weak order 1 bessel IIR on the way + // y[n] = (x[n] + x[n-1])*(1/2) (3dB cutoff at 11025 Hz, no gain) + // 1 sample delay for correct IIR filtering over audio frame boundaries if (PicoIn.opt & POPT_EN_STEREO) { short *d = PicoIn.sndOut + pos*2; - for (; len > 0; len--, d+=2) *d += dout; + // left channel only, mixed ro right channel in mixing phase + *d++ += Pico.snd.dac_val2; d++; + while (--len) *d++ += Pico.snd.dac_val, d++; } else { short *d = PicoIn.sndOut + pos; - for (; len > 0; len--, d++) *d += dout; + *d++ += Pico.snd.dac_val2; + while (--len) *d++ += Pico.snd.dac_val; } + Pico.snd.dac_val2 = (Pico.snd.dac_val + dout) >> 1; + Pico.snd.dac_val = dout; } PICO_INTERNAL void PsndDoPSG(int line_to) { - int line_from = Pico.snd.psg_line; - int pos, pos1, len; + int pos, len; int stereo = 0; - if (line_to >= 313) - line_to = 312; - - pos = dac_info[line_from]; - pos1 = dac_info[line_to + 1]; - len = pos1 - pos; - //elprintf(EL_STATUS, "%3d %3d %3d %3d %3d", - // pos, pos1, len, line_from, line_to); + // Q16, number of samples since last call + len = ((line_to+1) * Pico.snd.smpl_mult) - Pico.snd.psg_pos; if (len <= 0) return; - Pico.snd.psg_line = line_to + 1; + // update position and calculate buffer offset and length + pos = (Pico.snd.psg_pos+0x8000) >> 16; + Pico.snd.psg_pos += len; + len = ((Pico.snd.psg_pos+0x8000) >> 16) - pos; if (!PicoIn.sndOut || !(PicoIn.opt & POPT_EN_PSG)) return;
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/ym2612.c -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/ym2612.c
Changed
@@ -5,6 +5,10 @@ ** ** SSG-EG was also removed, because it's rarely used, Sega2.doc even does not ** document it ("proprietary") and tells to write 0 to SSG-EG control register. +** +** updated with fixes from mame 0.216 (file version 1.5.1) (kub) +** SSG-EG readded from GenPlus (kub) +** linear sample interpolation for chip to output rate adaption (kub) */ /* @@ -124,7 +128,7 @@ #endif -void memset32(int *dest, int c, int count); +void memset32(void *dest, int c, int count); #ifndef __GNUC__ @@ -148,7 +152,7 @@ #define FREQ_SH 16 /* 16.16 fixed point (frequency calculations) */ #define EG_SH 16 /* 16.16 fixed point (envelope generator timing) */ -#define LFO_SH 25 /* 7.25 fixed point (LFO calculations) */ +#define LFO_SH 24 /* 8.24 fixed point (LFO calculations) */ #define TIMER_SH 16 /* 16.16 fixed point (timers calculations) */ #define ENV_BITS 10 @@ -172,16 +176,6 @@ #define EG_TIMER_OVERFLOW (3*(1<<EG_SH)) /* envelope generator timer overflows every 3 samples (on real chip) */ -#define MAXOUT (+32767) -#define MINOUT (-32768) - -/* limitter */ -#define Limit(val, max,min) { \ - if ( val > max ) val = max; \ - else if ( val < min ) val = min; \ -} - - /* TL_TAB_LEN is calculated as: * 13 - sinus amplitude bits (Y axis) * 2 - sinus sign bit (Y axis) @@ -287,7 +281,7 @@ O(18),O(18),O(18),O(18),O(18),O(18),O(18),O(18), /* rates 00-11 */ -O( 0),O( 1),O( 2),O( 3), +O(18),O(18),O( 2),O( 3), O( 0),O( 1),O( 2),O( 3), O( 0),O( 1),O( 2),O( 3), O( 0),O( 1),O( 2),O( 3), @@ -328,10 +322,10 @@ #define O(a) (a*1) static const UINT8 eg_rate_shift[32+64+32]={ /* Envelope Generator counter shifts (32 + 64 rates + 32 RKS) */ /* 32 infinite time rates */ -O(0),O(0),O(0),O(0),O(0),O(0),O(0),O(0), -O(0),O(0),O(0),O(0),O(0),O(0),O(0),O(0), -O(0),O(0),O(0),O(0),O(0),O(0),O(0),O(0), -O(0),O(0),O(0),O(0),O(0),O(0),O(0),O(0), +O(11),O(11),O(11),O(11),O(11),O(11),O(11),O(11), +O(11),O(11),O(11),O(11),O(11),O(11),O(11),O(11), +O(11),O(11),O(11),O(11),O(11),O(11),O(11),O(11), +O(11),O(11),O(11),O(11),O(11),O(11),O(11),O(11), /* rates 00-11 */ O(11),O(11),O(11),O(11), @@ -517,7 +511,7 @@ but LFO works with one more bit of a precision so we really need 4096 elements */ static UINT32 fn_table[4096]; /* fnumber->increment counter */ -static int g_lfo_ampm = 0; +static int g_lfo_ampm; /* register number to channel number , slot offset */ #define OPN_CHAN(N) (N&3) @@ -552,6 +546,13 @@ ym2612.OPN.ST.status &= ~1; } +INLINE void recalc_volout(FM_SLOT *SLOT) +{ + INT16 vol_out = SLOT->volume; + if ((SLOT->ssg&0x0c) == 0x0c) + vol_out = (0x200 - SLOT->volume) & MAX_ATT_INDEX; + SLOT->vol_out = vol_out + SLOT->tl; +} INLINE void FM_KEYON(int c , int s ) { @@ -560,7 +561,15 @@ { SLOT->key = 1; SLOT->phase = 0; /* restart Phase Generator */ - SLOT->state = EG_ATT; /* phase -> Attack */ + SLOT->ssg ^= SLOT->ssgn; + SLOT->ssgn = 0; + SLOT->state = (SLOT->sl == MIN_ATT_INDEX) ? EG_SUS : EG_DEC; + if (SLOT->ar_ksr < 32+62) { + if (SLOT->volume > MIN_ATT_INDEX) SLOT->state = EG_ATT; + } else { + SLOT->volume = MIN_ATT_INDEX; + } +// recalc_volout(SLOT); ym2612.slot_mask |= (1<<s) << (c*4); } } @@ -571,8 +580,18 @@ if( SLOT->key ) { SLOT->key = 0; - if (SLOT->state>EG_REL) + if (SLOT->state>EG_REL) { SLOT->state = EG_REL;/* phase -> Release */ + if (SLOT->ssg&0x08) { + if (SLOT->ssg&0x04) + SLOT->volume = (0x200 - SLOT->volume); + if (SLOT->volume >= 0x200) { + SLOT->volume = MAX_ATT_INDEX; + SLOT->state = EG_OFF; + } + } + } + SLOT->vol_out = SLOT->volume + SLOT->tl; } } @@ -589,38 +608,38 @@ INLINE void set_tl(FM_SLOT *SLOT, int v) { SLOT->tl = (v&0x7f)<<(ENV_BITS-7); /* 7bit TL */ +// if (SLOT->state > EG_REL) +// recalc_volout(SLOT); } /* set attack rate & key scale */ INLINE void set_ar_ksr(FM_CH *CH, FM_SLOT *SLOT, int v) { UINT8 old_KSR = SLOT->KSR; + int eg_sh_ar, eg_sel_ar; SLOT->ar = (v&0x1f) ? 32 + ((v&0x1f)<<1) : 0; + SLOT->ar_ksr = SLOT->ar + SLOT->ksr; SLOT->KSR = 3-(v>>6); if (SLOT->KSR != old_KSR) { CH->SLOT[SLOT1].Incr=-1; } + + /* refresh Attack rate */ + if ((SLOT->ar_ksr) < 32+62) + { + eg_sh_ar = eg_rate_shift [SLOT->ar_ksr]; + eg_sel_ar = eg_rate_select[SLOT->ar_ksr]; + } else { - int eg_sh_ar, eg_sel_ar; - - /* refresh Attack rate */ - if ((SLOT->ar + SLOT->ksr) < 32+62) - { - eg_sh_ar = eg_rate_shift [SLOT->ar + SLOT->ksr ]; - eg_sel_ar = eg_rate_select[SLOT->ar + SLOT->ksr ]; - } - else - { - eg_sh_ar = 0; - eg_sel_ar = 17; - } - - SLOT->eg_pack_ar = eg_inc_pack[eg_sel_ar] | (eg_sh_ar<<24); + eg_sh_ar = 0; + eg_sel_ar = 18; } + + SLOT->eg_pack_ar = eg_inc_pack[eg_sel_ar] | (eg_sh_ar<<24); } /* set decay rate */ @@ -656,6 +675,9 @@ SLOT->sl = sl_table[ v>>4 ]; + if (SLOT->state == EG_DEC && (SLOT->volume >= (INT32)(SLOT->sl))) + SLOT->state = EG_SUS; + SLOT->rr = 34 + ((v&0x0f)<<2); eg_sh_rr = eg_rate_shift [SLOT->rr + SLOT->ksr]; @@ -715,12 +737,12 @@ if (prev_pos != pos) { lfo_ampm &= 0xff; - /* triangle */ + /* triangle (inverted) */
View file
libretro-picodrive-0~git20200112.tar.xz/pico/sound/ym2612.h -> libretro-picodrive-0~git20200716.tar.xz/pico/sound/ym2612.h
Changed
@@ -53,6 +53,12 @@ }; UINT32 eg_pack[4]; }; + + UINT8 ssg; /* 0x30 SSG-EG waveform */ + UINT8 ssgn; + UINT16 ar_ksr; /* 0x32 ar+ksr */ + UINT16 vol_out; /* 0x34 current output from EG (without LFO) */ + UINT16 vol_ipol; /* 0x36 interpolator memory */ } FM_SLOT; @@ -89,7 +95,7 @@ UINT8 address; /* 10 address register | need_save */ UINT8 status; /* 11 status flag | need_save */ UINT8 mode; /* mode CSM / 3SLOT */ - UINT8 pad; + UINT8 flags; /* operational flags */ int TA; /* timer a */ int TAC; /* timer a maxval */ int TAT; /* timer a ticker | need_save */ @@ -147,6 +153,7 @@ FM_OPN OPN; /* OPN state */ UINT32 slot_mask; /* active slot mask (performance hack) */ + UINT32 ssg_mask; /* active ssg mask (performance hack) */ } YM2612; #endif @@ -154,7 +161,7 @@ extern YM2612 ym2612; #endif -void YM2612Init_(int baseclock, int rate); +void YM2612Init_(int baseclock, int rate, int ssg); void YM2612ResetChip_(void); int YM2612UpdateOne_(int *buffer, int length, int stereo, int is_buf_empty); @@ -176,22 +183,22 @@ #else /* GP2X specific */ #include "../../platform/gp2x/940ctl.h" -extern int PicoIn.opt; -#define YM2612Init(baseclock,rate) { \ - if (PicoIn.opt&0x200) YM2612Init_940(baseclock, rate); \ - else YM2612Init_(baseclock, rate); \ -} -#define YM2612ResetChip() { \ - if (PicoIn.opt&0x200) YM2612ResetChip_940(); \ +#define YM2612Init(baseclock,rate,ssg) do { \ + if (PicoIn.opt&POPT_EXT_FM) YM2612Init_940(baseclock, rate, ssg); \ + else YM2612Init_(baseclock, rate, ssg); \ +} while (0) +#define YM2612ResetChip() do { \ + if (PicoIn.opt&POPT_EXT_FM) YM2612ResetChip_940(); \ else YM2612ResetChip_(); \ -} -#define YM2612UpdateOne(buffer,length,stereo,is_buf_empty) \ - (PicoIn.opt&0x200) ? YM2612UpdateOne_940(buffer, length, stereo, is_buf_empty) : \ - YM2612UpdateOne_(buffer, length, stereo, is_buf_empty); -#define YM2612PicoStateLoad() { \ - if (PicoIn.opt&0x200) YM2612PicoStateLoad_940(); \ +} while (0) +#define YM2612UpdateOne(buffer,length,stereo,is_buf_empty) do { \ + (PicoIn.opt&POPT_EXT_FM) ? YM2612UpdateOne_940(buffer, length, stereo, is_buf_empty) : \ + YM2612UpdateOne_(buffer, length, stereo, is_buf_empty); \ +} while (0) +#define YM2612PicoStateLoad() do { \ + if (PicoIn.opt&POPT_EXT_FM) YM2612PicoStateLoad_940(); \ else YM2612PicoStateLoad_(); \ -} +} while (0) #endif /* __GP2X__ */
View file
libretro-picodrive-0~git20200716.tar.xz/pico/sound/ym2612_arm.S
Added
@@ -0,0 +1,976 @@ +/* + * PicoDrive + * (C) notaz, 2006 + * (C) kub, 2020 added SSG-EG and simple output rate interpolation + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ + +@ this is a rewrite of MAME's ym2612 code, in particular this is only the main sample-generatin loop. +@ it does not seem to give much performance increase (if any at all), so don't use it if it causes trouble. +@ - notaz, 2006 + +@ vim:filetype=armasm + +#include "../arm_features.h" + +@ very simple YM2612 output rate to sample rate adaption (~500k cycles @44100) +#define INTERPOL +#define SSG_EG + +.equiv SLOT1, 0 +.equiv SLOT2, 2 +.equiv SLOT3, 1 +.equiv SLOT4, 3 +.equiv SLOT_STRUCT_SIZE, 0x38 + +.equiv TL_TAB_LEN, 0x1A00 + +.equiv EG_ATT, 4 +.equiv EG_DEC, 3 +.equiv EG_SUS, 2 +.equiv EG_REL, 1 +.equiv EG_OFF, 0 + +.equiv EG_SH, 16 @ 16.16 fixed point (envelope generator timing) +.equiv EG_TIMER_OVERFLOW, (3*(1<<EG_SH)) @ envelope generator timer overflows every 3 samples (on real chip) +.equiv LFO_SH, 24 /* 8.24 fixed point (LFO calculations) */ + +.equiv ENV_QUIET, (2*13*256/8) + +.text +.align 2 + PIC_LDR_INIT() + +@ r5=slot, r1=eg_cnt, trashes: r0,r2,r3 +@ writes output to routp, but only if vol_out changes +.macro update_eg_phase_slot +#if defined(INTERPOL) + ldrh r0, [r5,#0x34] @ vol_out +#endif + ldrb r2, [r5,#0x17] @ state + add r3, r5, #0x1c +#if defined(INTERPOL) + strh r0, [r5,#0x36] @ vol_ipol +#endif + tst r2, r2 + beq 0f @ EG_OFF + + ldr r2, [r3, r2, lsl #2] @ pack + mov r3, #1 + mov r0, r2, lsr #24 @ shift + mov r3, r3, lsl r0 + sub r3, r3, #1 + + tst r1, r3 + bne 0f @ no volume change + + mov r3, r1, lsr r0 + ldrb r0, [r5,#0x30] @ ssg + and r3, r3, #7 + add r3, r3, r3, lsl #1 + mov r3, r2, lsr r3 + and r3, r3, #7 @ eg_inc_val shift, may be 0 + ldrb r2, [r5,#0x17] @ state + +#if defined(SSG_EG) + tst r0, #0x08 @ ssg enabled? + tstne r12, #0x02 + bne 9f +#endif + + @ non-SSG-EG mode + cmp r2, #4 @ EG_ATT + ldrh r0, [r5,#0x1a] @ volume, unsigned (0-1023) + beq 4f + + cmp r2, #2 + mov r2, #1 + mov r2, r2, lsl r3 + mov r2, r2, lsr #1 @ eg_inc_val + add r0, r0, r2 + blt 1f @ EG_REL + beq 2f @ EG_SUS + +3: @ EG_DEC + ldr r2, [r5,#0x1c] @ sl (can be 16bit?) + mov r3, #EG_SUS + cmp r0, r2 @ if ( volume >= (INT32) SLOT->sl ) + strgeb r3, [r5,#0x17] @ state + b 10f + +4: @ EG_ATT + subs r3, r3, #1 @ eg_inc_val_shift - 1 + mvnpl r2, r0 + movpl r2, r2, lsl r3 + addpl r0, r0, r2, asr #4 + cmp r0, #0 @ if (volume <= MIN_ATT_INDEX) + bgt 10f + ldr r2, [r5,#0x1c] + mov r0, #0 + cmp r2, #0 + movne r3, #EG_DEC + moveq r3, #EG_SUS + strb r3, [r5,#0x17] @ state + b 10f + +2: @ EG_SUS + mov r2, #1024 + sub r2, r2, #1 @ r2 = MAX_ATT_INDEX + cmp r0, r2 @ if ( volume >= MAX_ATT_INDEX ) + movge r0, r2 + b 10f + +1: @ EG_REL + mov r2, #1024 + sub r2, r2, #1 @ r2 = MAX_ATT_INDEX + cmp r0, r2 @ if ( volume >= MAX_ATT_INDEX ) + movge r0, r2 + movge r3, #EG_OFF + strgeb r3, [r5,#0x17] @ state + +10: @ finish + ldrh r3, [r5,#0x18] @ tl + strh r0, [r5,#0x1a] @ volume +#if defined(SSG_EG) + b 11f + +9: @ SSG-EG mode + ldrh r0, [r5,#0x1a] @ volume, unsigned (0-1023) + cmp r2, #4 @ EG_ATT + beq 4f + + cmp r0, #0x200 @ if ( volume < 0x200 ) + movlt r0, #1 + movlt r3, r0, lsl r3 + ldrlth r0, [r5,#0x1a] @ volume, unsigned (0-1023) + movlt r3, r3, lsr #1 @ eg_inc_val + addlt r0, r0, r3, lsl #2 + + cmp r2, #2 + blt 1f @ EG_REL + beq 10f @ EG_SUS - nothing more to do + +3: @ EG_DEC + ldr r2, [r5,#0x1c] @ sl (can be 16bit?) + mov r3, #EG_SUS + cmp r0, r2 @ if ( volume >= (INT32) SLOT->sl ) + strgeb r3, [r5,#0x17] @ state + b 10f + +4: @ EG_ATT + subs r3, r3, #1 @ eg_inc_val_shift - 1 + mvnpl r2, r0 + movpl r2, r2, lsl r3 + addpl r0, r0, r2, asr #4 + cmp r0, #0 @ if (volume <= MIN_ATT_INDEX) + bgt 10f + ldr r2, [r5,#0x1c] + mov r0, #0 + cmp r2, #0 + movne r3, #EG_DEC + moveq r3, #EG_SUS + strb r3, [r5,#0x17] @ state + b 10f + +1: @ EG_REL + mov r2, #0x200 + cmp r0, r2 @ if ( volume >= 0x200 ) + movge r0, #1024 + subge r0, #1 + movge r3, #EG_OFF + strgeb r3, [r5,#0x17] @ state + +10: @ finish + ldrb r2, [r5,#0x30] @ ssg + ldrb r3, [r5,#0x17] @ state + strh r0, [r5,#0x1a] @ volume + cmp r2, #0x0c @ if ( ssg&0x04 && state > EG_REL ) + cmpge r3, #EG_REL+1 + ldrh r3, [r5,#0x18] @ tl + rsbge r0, r0, #0x200 @ volume = (0x200-volume) & MAX_ATT + lslge r0, r0, #22 + lsrge r0, r0, #22 + +11: +#endif + add r0, r0, r3 @ volume += tl + strh r0, [r5,#0x34] @ vol_out
View file
libretro-picodrive-0~git20200112.tar.xz/pico/state.c -> libretro-picodrive-0~git20200716.tar.xz/pico/state.c
Changed
@@ -254,6 +254,8 @@ CHECKED_WRITE_BUFF(CHUNK_ZRAM, PicoMem.zram); CHECKED_WRITE_BUFF(CHUNK_CRAM, PicoMem.cram); CHECKED_WRITE_BUFF(CHUNK_MISC, Pico.m); + + PicoVideoSave(); CHECKED_WRITE_BUFF(CHUNK_VIDEO, Pico.video); z80_pack(buff_z80); @@ -437,7 +439,11 @@ case CHUNK_CRAM: CHECKED_READ_BUFF(PicoMem.cram); break; case CHUNK_VSRAM: CHECKED_READ_BUFF(PicoMem.vsram); break; case CHUNK_MISC: CHECKED_READ_BUFF(Pico.m); break; - case CHUNK_VIDEO: CHECKED_READ_BUFF(Pico.video); break; + case CHUNK_VIDEO: + CHECKED_READ_BUFF(Pico.video); + PicoVideoLoad(); + break; + case CHUNK_IOPORTS: CHECKED_READ_BUFF(PicoMem.ioports); break; case CHUNK_PSG: CHECKED_READ2(28*4, sn76496_regs); break; case CHUNK_FM: @@ -563,7 +569,7 @@ z80_unpack(buff_z80); // due to dep from 68k cycles.. - Pico.t.m68c_aim = Pico.t.m68c_cnt; + Pico.t.m68c_frame_start = Pico.t.m68c_aim = Pico.t.m68c_cnt; if (PicoIn.AHW & PAHW_32X) Pico32xStateLoaded(0); if (PicoIn.AHW & PAHW_MCD)
View file
libretro-picodrive-0~git20200112.tar.xz/pico/videoport.c -> libretro-picodrive-0~git20200716.tar.xz/pico/videoport.c
Changed
@@ -2,6 +2,7 @@ * PicoDrive * (c) Copyright Dave, 2004 * (C) notaz, 2006-2009 + * (C) kub, 2020 * * This work is licensed under the terms of MAME license. * See COPYING file in the top-level directory. @@ -11,28 +12,330 @@ #define NEED_DMA_SOURCE #include "memory.h" -extern const unsigned char hcounts_32[]; -extern const unsigned char hcounts_40[]; +extern const unsigned char hcounts_32[], hcounts_40[]; +extern const unsigned char vdpcyc2sl_32_bl[], vdpcyc2sl_40_bl[]; +extern const unsigned char vdpcyc2sl_32[], vdpcyc2sl_40[]; +extern const unsigned short vdpsl2cyc_32_bl[], vdpsl2cyc_40_bl[]; +extern const unsigned short vdpsl2cyc_32[], vdpsl2cyc_40[]; -#ifndef UTYPES_DEFINED -typedef unsigned char u8; -typedef unsigned short u16; -typedef unsigned int u32; -#define UTYPES_DEFINED -#endif +static int blankline; // display disabled for this line + +unsigned SATaddr, SATmask; // VRAM addr of sprite attribute table int (*PicoDmaHook)(unsigned int source, int len, unsigned short **base, unsigned int *mask) = NULL; + +/* VDP FIFO implementation + * + * fifo_slot: last slot executed in this scanline + * fifo_cnt: #slots remaining for active FIFO write (#writes<<#bytep) + * fifo_total: #total FIFO entries pending + * fifo_data: last values transferred through fifo + * fifo_queue: fifo transfer queue (#writes, flags) + * + * FIFO states: empty total=0 + * inuse total>0 && total<4 + * full total==4 + * wait total>4 + * Conditions: + * fifo_slot is always behind slot2cyc[cycles]. Advancing it beyond cycles + * implies blocking the 68k up to that slot. + * + * A FIFO write goes to the end of the FIFO queue, but DMA running in background + * is always the last queue entry (transfers by CPU intervene and come 1st). + * There can be more pending writes than FIFO slots, but the CPU will be blocked + * until FIFO level (without background DMA) <= 4. + * This is only about correct timing, data xfer must be handled by the caller. + * Blocking the CPU means burning cycles via SekCyclesBurn*(), which is to be + * executed by the caller. + * + * FIFOSync "executes" FIFO write slots up to the given cycle in the current + * scanline. A queue entry completely executed is removed from the queue. + * FIFOWrite pushes writes to the transfer queue. If it's a blocking write, 68k + * is blocked if more than 4 FIFO writes are pending. + * FIFORead executes a 68k read. 68k is blocked until the next transfer slot. + */ + +// NB code assumes fifo_* arrays have size 2^n +static struct VdpFIFO { // XXX this must go into save file! + // last transferred FIFO data, ...x = index XXX currently only CPU + unsigned short fifo_data[4], fifo_dx; + + // queued FIFO transfers, ...x = index, ...l = queue length + // each entry has 2 values: [n]>>3 = #writes, [n]&7 = flags (FQ_*) + unsigned int fifo_queue[8], fifo_qx, fifo_ql; + unsigned int fifo_total; // total# of pending FIFO entries (w/o BGDMA) + + unsigned short fifo_slot; // last executed slot in current scanline + unsigned short fifo_maxslot;// #slots in scanline + + const unsigned char *fifo_cyc2sl; + const unsigned short *fifo_sl2cyc; +} VdpFIFO; + +enum { FQ_BYTE = 1, FQ_BGDMA = 2, FQ_FGDMA = 4 }; // queue flags, NB: BYTE = 1! + +// do the FIFO math +static __inline int AdvanceFIFOEntry(struct VdpFIFO *vf, struct PicoVideo *pv, int slots) +{ + int l = slots, b = vf->fifo_queue[vf->fifo_qx] & FQ_BYTE; + int cnt = pv->fifo_cnt; + + // advance currently active FIFO entry + if (l > cnt) + l = cnt; + if (!(vf->fifo_queue[vf->fifo_qx] & FQ_BGDMA)) + vf->fifo_total -= ((cnt & b) + l) >> b; + cnt -= l; + + // if entry has been processed... + if (cnt == 0) { + // remove entry from FIFO + if (vf->fifo_ql) { + vf->fifo_queue[vf->fifo_qx] = 0; + vf->fifo_qx = (vf->fifo_qx+1) & 7, vf->fifo_ql --; + } + // start processing for next entry if there is one + if (vf->fifo_ql) { + b = vf->fifo_queue[vf->fifo_qx] & FQ_BYTE; + cnt = (vf->fifo_queue[vf->fifo_qx] >> 3) << b; + } else { // FIFO empty + pv->status &= ~PVS_FIFORUN; + vf->fifo_total = 0; + } + } + + pv->fifo_cnt = cnt; + return l; +} + +static __inline void SetFIFOState(struct VdpFIFO *vf, struct PicoVideo *pv) +{ + unsigned int st = pv->status, cmd = pv->command; + // release CPU and terminate DMA if FIFO isn't blocking the 68k anymore + if (vf->fifo_total <= 4) { + st &= ~PVS_CPUWR; + if (!(st & (PVS_DMABG|PVS_DMAFILL))) { + st &= ~SR_DMA; + cmd &= ~0x80; + } + } + if (pv->fifo_cnt == 0) { + st &= ~PVS_CPURD; + // terminate DMA if applicable + if (!(st & (PVS_FIFORUN|PVS_DMAFILL))) { + st &= ~(SR_DMA|PVS_DMABG); + cmd &= ~0x80; + } + } + pv->status = st; + pv->command = cmd; +} + +// sync FIFO to cycles +void PicoVideoFIFOSync(int cycles) +{ + struct VdpFIFO *vf = &VdpFIFO; + struct PicoVideo *pv = &Pico.video; + int slots, done; + + // calculate #slots since last executed slot + slots = vf->fifo_cyc2sl[cycles>>1] - vf->fifo_slot; + + // advance FIFO queue by #done slots + done = slots; + while (done > 0 && pv->fifo_cnt) { + int l = AdvanceFIFOEntry(vf, pv, done); + vf->fifo_slot += l; + done -= l; + } + + if (done != slots) + SetFIFOState(vf, pv); +} + +// drain FIFO, blocking 68k on the way. FIFO must be synced prior to drain. +static int PicoVideoFIFODrain(int level, int cycles, int bgdma) +{ + struct VdpFIFO *vf = &VdpFIFO; + struct PicoVideo *pv = &Pico.video; + unsigned ocyc = cycles; + int burn = 0; +//int osl = fifo_slot; + + // process FIFO entries until low level is reached + while (vf->fifo_slot <= vf->fifo_maxslot && cycles < 488 && + ((vf->fifo_total > level) | (vf->fifo_queue[vf->fifo_qx] & bgdma))) { + int b = vf->fifo_queue[vf->fifo_qx] & FQ_BYTE; + int cnt = bgdma ? pv->fifo_cnt : ((vf->fifo_total-level)<<b) - (pv->fifo_cnt&b); + int slot = (pv->fifo_cnt<cnt ? pv->fifo_cnt:cnt) + vf->fifo_slot; + + if (slot > vf->fifo_maxslot) { + // target slot in later scanline, advance to eol + slot = vf->fifo_maxslot; + cycles = 488; + } else { + // advance FIFO to target slot and CPU to cycles at that slot + cycles = vf->fifo_sl2cyc[slot]<<1; + } + if (slot > vf->fifo_slot) { + AdvanceFIFOEntry(vf, pv, slot - vf->fifo_slot); + vf->fifo_slot = slot; + } + } + if (cycles > ocyc) + burn = cycles - ocyc; + + SetFIFOState(vf, pv); + + return burn; +} + +// read VDP data port
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/arm_utils.s -> libretro-picodrive-0~git20200716.tar.xz/platform/common/arm_utils.s
Changed
@@ -141,6 +141,7 @@ movne lr, #64 tstne r3, r3 addne r0, r0, #32 + addne r1, r1, #32 vidCpyM2_loop_out: mov r6, #10
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/common.mak -> libretro-picodrive-0~git20200716.tar.xz/platform/common/common.mak
Changed
@@ -9,6 +9,8 @@ asm_ym2612 = 0 asm_misc = 0 asm_cdmemory = 0 +asm_32xdraw = 0 +asm_32xmemory = 0 asm_mix = 0 endif @@ -40,6 +42,10 @@ DEFINES += PPROF SRCS_COMMON += $(R)platform/linux/pprof.c endif +ifeq "$(gperf)" "1" +DEFINES += GPERF +LDFLAGS += -lprofiler -lstdc++ +endif # ARM asm stuff ifeq "$(ARCH)" "arm" @@ -53,7 +59,7 @@ endif ifeq "$(asm_ym2612)" "1" DEFINES += _ASM_YM2612_C -SRCS_COMMON += $(R)pico/sound/ym2612_arm.s +SRCS_COMMON += $(R)pico/sound/ym2612_arm.S endif ifeq "$(asm_misc)" "1" DEFINES += _ASM_MISC_C @@ -66,7 +72,11 @@ endif ifeq "$(asm_32xdraw)" "1" DEFINES += _ASM_32X_DRAW -SRCS_COMMON += $(R)pico/32x/draw_arm.s +SRCS_COMMON += $(R)pico/32x/draw_arm.S +endif +ifeq "$(asm_32xmemory)" "1" +DEFINES += _ASM_32X_MEMORY_C +SRCS_COMMON += $(R)pico/32x/memory_arm.S endif ifeq "$(asm_mix)" "1" SRCS_COMMON += $(R)pico/sound/mix_arm.S @@ -138,7 +148,7 @@ # --- Z80 --- ifeq "$(use_drz80)" "1" DEFINES += _USE_DRZ80 -SRCS_COMMON += $(R)cpu/DrZ80/drz80.s +SRCS_COMMON += $(R)cpu/DrZ80/drz80.S endif # ifeq "$(use_cz80)" "1" @@ -157,8 +167,16 @@ ifdef drc_debug DEFINES += DRC_DEBUG=$(drc_debug) SRCS_COMMON += $(R)cpu/sh2/mame/sh2dasm.c -SRCS_COMMON += $(R)platform/libpicofe/linux/host_dasm.c -LDFLAGS += -lbfd -lopcodes -liberty +DASM = $(R)platform/libpicofe/linux/host_dasm.c +DASMLIBS = -lbfd -lopcodes -liberty +ifeq ("$(ARCH)",$(filter "$(ARCH)","arm" "mipsel")) +ifeq ($(filter_out $(shell $(CC) --print-file-name=libbfd.so),"/"),) +DASM = $(R)platform/common/host_dasm.c +DASMLIBS = +endif +endif +SRCS_COMMON += $(DASM) +LDFLAGS += $(DASMLIBS) endif endif # use_sh2drc SRCS_COMMON += $(R)cpu/sh2/mame/sh2pico.c @@ -181,7 +199,7 @@ $(FR)cpu/cyclone/Cyclone.s: $(FR)cpu/$(CYCLONE_CONFIG) @echo building Cyclone... - @make CC=$(CYCLONE_CC) CXX=$(CYCLONE_CXX) -C $(R)cpu/cyclone/ CONFIG_FILE=../$(CYCLONE_CONFIG) + @make CC=$(CYCLONE_CC) CXX=$(CYCLONE_CXX) -C $(R)cpu/cyclone/ CONFIG_FILE=../$(CYCLONE_CONFIG) HAVE_ARMv6=$(HAVE_ARMv6) $(FR)cpu/cyclone/Cyclone.s: $(FR)cpu/cyclone/*.cpp $(FR)cpu/cyclone/*.h
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/config_file.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/config_file.c
Changed
@@ -39,7 +39,7 @@ static int seek_sect(FILE *f, const char *section) { - char line[128], *tmp; + char line[640], *tmp; int len; len = strlen(section); @@ -100,7 +100,7 @@ FILE *fn = NULL; menu_entry *me; int t; - char line[128]; + char line[640]; fn = fopen(fname, "w"); if (fn == NULL) @@ -169,7 +169,7 @@ int config_writelrom(const char *fname) { - char line[128], *tmp, *optr = NULL; + char line[640], *tmp, *optr = NULL; char *old_data = NULL; int size; FILE *f; @@ -216,7 +216,7 @@ int config_readlrom(const char *fname) { - char line[128], *tmp; + char line[640], *tmp; int i, len, ret = -1; FILE *f; @@ -326,6 +326,10 @@ currentConfig.gamma = atoi(val); return 1; + case MA_OPT2_MAX_FRAMESKIP: + currentConfig.max_skip = atoi(val); + return 1; + /* PSP */ case MA_OPT3_SCALE: if (strcasecmp(var, "Scale factor") != 0) return 0; @@ -503,7 +507,7 @@ int config_readsect(const char *fname, const char *section) { - char line[128], *var, *val; + char line[640], *var, *val; int keys_encountered = 0; FILE *f; int ret;
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/disarm.c
Added
@@ -0,0 +1,485 @@ +/* + * Copyright (c) 2012 Wojtek Kaniewski <wojtekka@toxygen.net> + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include <stdio.h> +#include <string.h> + +#define IMM_FORMAT "0x%x" +//#define IMM_FORMAT "%d" +#define ADDR_FORMAT "0x%x" + +static inline unsigned int rol(unsigned int value, unsigned int shift) +{ + shift &= 31; + + return (value >> shift) | (value << (32 - shift)); +} + +static inline const char *condition(unsigned int insn) +{ + const char *conditions[16] = { "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc", "hi", "ls", "ge", "lt", "gt", "le", "", "nv" }; + return conditions[(insn >> 28) & 0x0f]; +} + +static inline const char *register_name(unsigned int reg) +{ + const char *register_names[16] = { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "sp", "lr", "pc" }; + return register_names[reg & 0x0f]; +} + +static const char *register_list(unsigned int list, char *buf, size_t buf_len) +{ + int i; + + buf[0] = 0; + + for (i = 0; i < 16; i++) + { + if ((list >> i) & 1) + { + snprintf(buf + strlen(buf), buf_len - strlen(buf), "%s%s", (buf[0] == 0) ? "" : ",", register_name(i)); + } + } + + return buf; +} + +static const char *shift(unsigned int insn, char *buf, size_t buf_len) +{ + unsigned int imm = (insn >> 7) & 0x1f; + const char *rn = register_name(insn >> 8); + unsigned int type = (insn >> 4) & 0x07; + + switch (type) + { + case 0: + snprintf(buf, buf_len, (imm != 0) ? ",lsl #%d" : "", imm); + break; + case 1: + snprintf(buf, buf_len, ",lsl %s", rn); + break; + case 2: + snprintf(buf, buf_len, ",lsr #%d", imm ? imm : 32); + break; + case 3: + snprintf(buf, buf_len, ",lsr %s", rn); + break; + case 4: + snprintf(buf, buf_len, ",asr #%d", imm ? imm : 32); + break; + case 5: + snprintf(buf, buf_len, ",asr %s", rn); + break; + case 6: + snprintf(buf, buf_len, (imm != 0) ? ",ror #%d" : ",rrx", imm); + break; + case 7: + snprintf(buf, buf_len, ",ror %s", rn); + break; + } + + return buf; +} + +static const char *immediate(unsigned int imm, int negative, int show_if_zero, char *buf, size_t buf_len) +{ + if (imm || show_if_zero) + { + snprintf(buf, buf_len, ",#%s" IMM_FORMAT, (negative) ? "-" : "", imm); + return buf; + } + + return ""; +} + +static int data_processing(unsigned int pc, unsigned int insn, char *buf, size_t buf_len) +{ + unsigned int oper = (insn >> 21) & 15; + const char *names[16] = { "and", "eor", "sub", "rsb", "add", "adc", "sbc", "rsc", "tst", "teq", "cmp", "cmn", "orr", "mov", "bic", "mvn" }; + const char *name; + const char *s; + unsigned int rd; + unsigned int rn; + int is_move = ((oper == 13) || (oper == 15)); + int is_test = ((oper >= 8) && (oper <= 11)); + char tmp_buf[64]; + + name = names[oper]; + s = ((insn >> 20) & 1) ? "s" : ""; + rn = (insn >> 16) & 15; + rd = (insn >> 12) & 15; + + /* mov r0,r0,r0 is a nop */ + if (insn == 0xe1a00000) + { + snprintf(buf, buf_len, "nop"); + return 1; + } + + /* mrs */ + if ((insn & 0x0fbf0fff) == 0x010f0000) + { + const char *psr = ((insn >> 22) & 1) ? "spsr" : "cpsr"; + const char *rd = register_name(insn >> 12); + + snprintf(buf, buf_len, "mrs%s %s,%s", condition(insn), rd, psr); + + return 1; + } + + /* msr flag only*/ + if ((insn & 0x0db0f000) == 0x0120f000) + { + const char *psr = ((insn >> 22) & 1) ? "spsr" : "cpsr"; + const char *suffix; + + switch ((insn >> 16) & 15) + { + case 9: + suffix = ""; + break; + case 8: + suffix = "_f"; + break; + case 1: + suffix = "_c"; + break; + default: + return 0; + } + + if ((insn >> 25) & 1) + { + unsigned int imm = rol(insn & 0x000000ff, ((insn >> 8) & 15) * 2); + + snprintf(buf, buf_len, "msr%s %s%s,#" IMM_FORMAT, condition(insn), psr, suffix, imm); + } + else + { + const char *rm = register_name(insn >> 0); + + if (((insn >> 4) & 255) != 0) + { + return 0; + } + + snprintf(buf, buf_len, "msr%s %s%s,%s", condition(insn), psr, suffix, rm); + } + + return 1; + } + + if (((insn >> 25) & 1) == 0) + { + unsigned int rm; + + rm = (insn & 15); + + if (is_move) + { + snprintf(buf, buf_len, "%s%s%s %s,%s%s", name, condition(insn), s, register_name(rd), register_name(rm), shift(insn, tmp_buf, sizeof(tmp_buf)));
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/disarm.h
Added
@@ -0,0 +1,28 @@ +/* + * Copyright (C) 2012 Wojtek Kaniewski <wojtekka@toxygen.net> + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef DISARM_H +#define DISARM_H + +int disarm(uintptr_t pc, uint32_t insn, char *buf, size_t buf_len, unsigned long *sym); + +#endif /* DISARM_H */
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/dismips.c
Added
@@ -0,0 +1,412 @@ +/* + * very basic mips disassembler for MIPS32/MIPS64 Release 2, only for picodrive + * Copyright (C) 2019 kub + * + * This work is licensed under the terms of MAME license. + * See COPYING file in the top-level directory. + */ + +// unimplemented insns: MOV[FT], SYSCALL, BREAK, SYNC, SYNCI, T*, SDBBP, RDHWR, +// CACHE, PREF, LWC*/LDC*, SWC*/SDC*, and all of COP* (fpu, mmu, irq, exc, ...) +// unimplemented variants of insns: EHB, SSNOP (both SLL zero), JALR.HB, JR.HB +// however, it's certainly good enough for anything picodrive DRC throws at it. + +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <string.h> + +#include "dismips.h" + + +static char *const register_names[32] = { + "$zero", + "$at", + "$v0", + "$v1", + "$a0", + "$a1", + "$a2", + "$a3", + "$t0", + "$t1", + "$t2", + "$t3", + "$t4", + "$t5", + "$t6", + "$t7", + "$s0", + "$s1", + "$s2", + "$s3", + "$s4", + "$s5", + "$s6", + "$s7", + "$t8", + "$t9", + "$k0", + "$k1", + "$gp", + "$sp", + "$fp", + "$ra" +}; + + +enum insn_type { + REG_DTS, REG_TS, // 3, 2, or 1 regs + REG_DS, REG_DT, REG_D, REG_S, + S_IMM_DT, // 2 regs with shift amount + F_IMM_TS, // 2 regs with bitfield spec + B_IMM_S, B_IMM_TS, // pc-relative branches with 1 or 2 regs + J_IMM, // region-relative jump + A_IMM_TS, // arithmetic immediate with 2 regs + L_IMM_T, L_IMM_TS, // logical immediate with 1 or 2 regs + M_IMM_TS, // memory indexed with 2 regs + SR_BIT = 0x80 // shift right with R-bit +}; + +struct insn { + unsigned char op; + enum insn_type type; + char *name; +}; + +// ATTN: these array MUST be sorted by op (decode relies on it) + +// instructions with opcode SPECIAL (R-type) +#define OP_SPECIAL 0x00 +static const struct insn special_insns[] = { + {0x00, S_IMM_DT, "sll"}, +// {0x01, , "movf\0movt"}, + {0x02, S_IMM_DT|SR_BIT, "srl\0rotr"}, + {0x03, S_IMM_DT, "sra"}, + {0x04, REG_DTS, "sllv"}, + {0x06, REG_DTS|SR_BIT, "srlv\0rotrv"}, + {0x07, REG_DTS, "srav"}, + {0x08, REG_S, "jr"}, + {0x09, REG_DS, "jalr"}, + {0x0a, REG_DTS, "movz"}, + {0x0b, REG_DTS, "movn"}, +// {0x0c, , "syscall"}, +// {0x0d, , "break"}, +// {0x0f, , "sync"}, + {0x10, REG_D, "mfhi"}, + {0x11, REG_S, "mthi"}, + {0x12, REG_D, "mflo"}, + {0x13, REG_S, "mtlo"}, + {0x14, REG_DTS, "dsllv"}, + {0x16, REG_DTS|SR_BIT, "dsrlv\0drotrv"}, + {0x17, REG_DTS, "dsrav"}, + {0x18, REG_TS, "mult"}, + {0x19, REG_TS, "multu"}, + {0x1A, REG_TS, "div"}, + {0x1B, REG_TS, "divu"}, + {0x1C, REG_TS, "dmult"}, + {0x1D, REG_TS, "dmultu"}, + {0x1E, REG_TS, "ddiv"}, + {0x1F, REG_TS, "ddivu"}, + {0x20, REG_DTS, "add"}, + {0x21, REG_DTS, "addu"}, + {0x22, REG_DTS, "sub"}, + {0x23, REG_DTS, "subu"}, + {0x24, REG_DTS, "and"}, + {0x25, REG_DTS, "or"}, + {0x26, REG_DTS, "xor"}, + {0x27, REG_DTS, "nor"}, + {0x2A, REG_DTS, "slt"}, + {0x2B, REG_DTS, "sltu"}, + {0x2C, REG_DTS, "dadd"}, + {0x2D, REG_DTS, "daddu"}, + {0x2E, REG_DTS, "dsub"}, + {0x2F, REG_DTS, "dsubu"}, +// {0x30, REG_TS, "tge" }, +// {0x31, REG_TS, "tgeu" }, +// {0x32, REG_TS, "tlt" }, +// {0x33, REG_TS, "tltu" }, +// {0x34, REG_TS, "teq" }, +// {0x36, REG_TS, "tne" }, + {0x38, S_IMM_DT, "dsll"}, + {0x3A, S_IMM_DT|SR_BIT, "dsrl\0drotrv"}, + {0x3B, S_IMM_DT, "dsra"}, + {0x3C, S_IMM_DT, "dsll32"}, + {0x3E, S_IMM_DT|SR_BIT, "dsrl32\0drotr32"}, + {0x3F, S_IMM_DT, "dsra32"}, +}; + +// instructions with opcode SPECIAL2 (R-type) +#define OP_SPECIAL2 0x1C +static const struct insn special2_insns[] = { + {0x00, REG_TS, "madd" }, + {0x01, REG_TS, "maddu" }, + {0x02, REG_TS, "mul" }, + {0x04, REG_TS, "msub" }, + {0x05, REG_TS, "msubu" }, + {0x20, REG_DS, "clz" }, + {0x21, REG_DS, "clo" }, + {0x24, REG_DS, "dclz" }, + {0x25, REG_DS, "dclo" }, +// {0x37, , "sdbbp" }, +}; + +// instructions with opcode SPECIAL3 (R-type) +#define OP_SPECIAL3 0x1F +static const struct insn special3_insns[] = { + {0x00, F_IMM_TS, "ext" }, + {0x01, F_IMM_TS, "dextm" }, + {0x02, F_IMM_TS, "dextu" }, + {0x03, F_IMM_TS, "dext" }, + {0x04, F_IMM_TS, "ins" }, + {0x05, F_IMM_TS, "dinsm" }, + {0x06, F_IMM_TS, "dinsu" }, + {0x07, F_IMM_TS, "dins" }, +// {0x3b, , "rdhwr" }, +}; + +// instruction with opcode SPECIAL3 and function *BSHFL +#define FN_BSHFL 0x20 +static const struct insn bshfl_insns[] = { + {0x02, REG_DT, "wsbh" }, + {0x10, REG_DT, "seb" }, + {0x18, REG_DT, "seh" }, +}; +#define FN_DBSHFL 0x24 +static const struct insn dbshfl_insns[] = { + {0x02, REG_DT, "dsbh" }, + {0x05, REG_DT, "dshd" }, +}; + +// instructions with opcode REGIMM (I-type) +#define OP_REGIMM 0x01 +static const struct insn regimm_insns[] = { + {0x00, B_IMM_S, "bltz"}, + {0x01, B_IMM_S, "bgez"}, + {0x02, B_IMM_S, "bltzl"}, + {0x03, B_IMM_S, "bgezl"}, +// {0x08, , "tgei"}, +// {0x09, , "tgeiu"}, +// {0x0a, , "tlti"}, +// {0x0b, , "tltiu"}, +// {0x0c, , "teqi"}, +// {0x0e, , "tnei"}, + {0x10, B_IMM_S, "bltzal"}, + {0x11, B_IMM_S, "bgezal"}, + {0x12, B_IMM_S, "bltzall"}, + {0x13, B_IMM_S, "bgezall"}, + {0x13, B_IMM_S, "bgezall"}, +// {0x1f, , "synci" },
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/dismips.h
Added
@@ -0,0 +1,6 @@ +#ifndef DISMIPS_H +#define DISMIPS_H + +int dismips(uintptr_t pc, uint32_t insn, char *buf, size_t buf_len, unsigned long *sym); + +#endif /* DISMIPS_H */
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/emu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/emu.c
Changed
@@ -600,6 +600,7 @@ defaultConfig.turbo_rate = 15; defaultConfig.msh2_khz = PICO_MSH2_HZ / 1000; defaultConfig.ssh2_khz = PICO_SSH2_HZ / 1000; + defaultConfig.max_skip = 4; // platform specific overrides pemu_prep_defconfig(); @@ -1411,8 +1412,10 @@ { notice_msg_time = 0; plat_status_msg_clear(); +#ifndef __GP2X__ plat_video_flip(); plat_status_msg_clear(); /* Do it again in case of double buffering */ +#endif notice_msg = NULL; } else { @@ -1465,10 +1468,16 @@ else if (diff < -target_frametime_x3) { /* no time left for this frame - skip */ - /* limit auto frameskip to 8 */ - if (frames_done / 8 <= frames_shown) + /* limit auto frameskip to max_skip */ + if (fskip_cnt < currentConfig.max_skip) { + fskip_cnt++; skip = 1; - } + } + else { + fskip_cnt = 0; + } + } else + fskip_cnt = 0; // don't go in debt too much while (diff < -target_frametime_x3 * 3) {
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/emu.h -> libretro-picodrive-0~git20200716.tar.xz/platform/common/emu.h
Changed
@@ -76,6 +76,7 @@ int msh2_khz; int ssh2_khz; int overclock_68k; + int max_skip; } currentConfig_t; extern currentConfig_t currentConfig, defaultConfig;
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/helix
Added
+(directory)
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/helix/Makefile
Added
@@ -0,0 +1,43 @@ +CROSS ?= arm-linux-gnueabi- + +CC = $(CROSS)gcc +AS = $(CROSS)as +AR = $(CROSS)ar +TOOLCHAIN = $(notdir $(CROSS)) +LIBGCC ?= ${HOME}/opt/open2x/gcc-4.1.1-glibc-2.3.6/lib/gcc/arm-open2x-linux/4.1.1/libgcc.a + +CFLAGS += -Ipub -O2 -Wall -fstrict-aliasing -ffast-math +ifneq ($(findstring arm-,$(TOOLCHAIN)),) +CFLAGS += -mcpu=arm940t -mtune=arm940t -mfloat-abi=soft -mfpu=fpa -mabi=apcs-gnu -mno-thumb-interwork +ASFLAGS = -mcpu=arm940t -mfloat-abi=soft -mfpu=fpa -mabi=apcs-gnu +OBJS += real/arm/asmpoly_gcc.o +else +CFLAGS += -m32 +ASFLAGS += -m32 +OBJS += real/polyphase.o +endif + +LIB = $(TOOLCHAIN)helix_mp3.a +SHLIB = $(TOOLCHAIN)helix_mp3.so + +all: $(LIB) $(SHLIB) + + +OBJS += mp3dec.o mp3tabs.o +#OBJS += ipp/bitstream.o ipp/buffers.o ipp/dequant.o ipp/huffman.o ipp/imdct.o ipp/subband.o +OBJS += real/bitstream.o real/buffers.o real/dct32.o real/dequant.o real/dqchan.o real/huffman.o +OBJS += real/hufftabs.o real/imdct.o real/scalfact.o real/stproc.o real/subband.o real/trigtabs.o + +OBJS += lib.o + +real/arm/asmpoly_gcc.o: real/arm/asmpoly_gcc.s + $(CC) -o $@ $(ASFLAGS) -c $< + +$(LIB) : $(OBJS) + $(AR) r $@ $^ +$(SHLIB) : $(OBJS) $(LIBGCC) + $(CC) -o $@ -nostdlib -shared $(CFLAGS) $^ + +clean: + $(RM) -f $(OBJS) +
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/helix/lib.c
Added
@@ -0,0 +1,57 @@ +#include <stdlib.h> +#include <stdint.h> + +// libgcc has this with gcc 4.x +void raise(int sig) +{ +} + +// very limited heap functions for helix decoder + +static char heap[65000] __attribute__((aligned(16))); +static long heap_offs; + +void __malloc_init(void) +{ + heap_offs = 0; +} + +void *malloc(size_t size) +{ + void *chunk = heap + heap_offs; + size = (size+15) & ~15; + if (heap_offs + size > sizeof(heap)) + return NULL; + else { + heap_offs += size; + return chunk; + } +} + +void free(void *chunk) +{ + if (chunk == heap) + heap_offs = 0; +} + +#if 0 +void *memcpy (void *dest, const void *src, size_t n) +{ + char *_dest = dest; + const char *_src = src; + while (n--) *_dest++ = *_src++; + return dest; +} + +void *memmove (void *dest, const void *src, size_t n) +{ + char *_dest = dest+n; + const char *_src = src+n; + if (dest <= src || dest >= _src) + return memcpy(dest, src, n); + while (n--) *--_dest = *--_src; + return dest; +} +#else +#include "../memcpy.c" +#endif
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/host_dasm.c
Added
@@ -0,0 +1,93 @@ +/* + * DRC host disassembler interface for MIPS/ARM32 for use without binutils + * (C) kub, 2018,2019 + */ +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <string.h> + +#ifdef __mips__ +#include "dismips.c" +#define disasm dismips +#else +#include "disarm.c" +#define disasm disarm +#endif + +/* symbols */ +typedef struct { const char *name; void *value; } asymbol; + +static asymbol **symbols; +static long symcount, symstorage = 8; + +static const char *lookup_name(void *addr) +{ + asymbol **sptr = symbols; + int i; + + for (i = 0; i < symcount; i++) { + asymbol *sym = *sptr++; + + if (addr == sym->value) + return sym->name; + } + + return NULL; +} + +void host_dasm(void *addr, int len) +{ + void *end = (char *)addr + len; + const char *name; + char buf[64]; + unsigned long insn, symaddr; + + while (addr < end) { + name = lookup_name(addr); + if (name != NULL) + printf("%s:\n", name); + + insn = *(unsigned long *)addr; + printf(" %08lx %08lx ", (long)addr, insn); + if(disasm((unsigned)addr, insn, buf, sizeof(buf), &symaddr)) + { + if (symaddr) + name = lookup_name((void *)symaddr); + if (symaddr && name) + printf("%s <%s>\n", buf, name); + else if (symaddr && !name) + printf("%s <unknown>\n", buf); + else + printf("%s\n", buf); + } else + printf("unknown (0x%08lx)\n", insn); + addr = (char *)addr + sizeof(long); + } +} + +void host_dasm_new_symbol_(void *addr, const char *name) +{ + asymbol *sym, **tmp; + + if (symbols == NULL) + symbols = malloc(symstorage); + if (symstorage <= symcount * sizeof(symbols[0])) { + tmp = realloc(symbols, symstorage * 2); + if (tmp == NULL) + return; + symstorage *= 2; + symbols = tmp; + } + + symbols[symcount] = calloc(sizeof(*symbols[0]), 1); + if (symbols[symcount] == NULL) + return; + + // a HACK (should use correct section), but ohwell + sym = symbols[symcount]; + sym->value = addr; + sym->name = name; + symcount++; +} +
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/main.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/main.c
Changed
@@ -93,6 +93,10 @@ emu_init(); menu_init(); +#ifdef GPERF + ProfilerStart("gperf.out"); +#endif + engineState = PGS_Menu; if (argc > 1) @@ -148,6 +152,9 @@ } endloop: +#ifdef GPERF + ProfilerStop(); +#endif emu_finish(); plat_finish();
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/memcpy.c
Added
@@ -0,0 +1,134 @@ +/* + * (C) 2018 Kai-Uwe Bloem <derkub@gmail.com> + * + * 32bit ARM/MIPS optimized C implementation of memcpy and memove, designed for + * good performance with gcc. + * - if src and dest have the same alignment, 4-word copy is used. + * - if src and dest are unaligned to each other, still loads word data and + * stores correctly shifted word data (for all but the first and last bytes + * to avoid under/overstepping the src region). + * + * ATTN does dirty aliasing tricks with undefined behaviour by standard. + * (however, this improved the generated code). + * ATTN uses struct assignment, which only works if the compiler is inlining + * this (else it would probably call memcpy :-)). + */ +#include <stdlib.h> +#include <stdint.h> + +#include <endian.h> +#if __BYTE_ORDER == __LITTLE_ENDIAN +#define _L_ >> +#define _U_ << +#else +#define _L_ << +#define _U_ >> +#endif + +void *memcpy(void *dest, const void *src, size_t n) +{ + struct _16 { uint32_t a[4]; }; + union { const void *v; uint8_t *c; uint32_t *i; uint64_t *l; struct _16 *s; } + ss = { src }, ds = { dest }; + const int lm = sizeof(uint32_t)-1; + + /* align src to word */ + while (((uintptr_t)ss.c & lm) && n > 0) + *ds.c++ = *ss.c++, n--; + if (((uintptr_t)ds.c & lm) == 0) { + /* fast copy if pointers have the same aligment */ + while (n >= sizeof(struct _16)) /* copy 16 byte blocks */ + *ds.s++ = *ss.s++, n -= sizeof(struct _16); + if (n >= sizeof(uint64_t)) /* copy leftover 8 byte block */ + *ds.l++ = *ss.l++, n -= sizeof(uint64_t); +// if (n >= sizeof(uint32_t)) /* copy leftover 4 byte block */ +// *ds.i++ = *ss.i++, n -= sizeof(uint32_t); + } else if (n >= 2*sizeof(uint32_t)) { + /* unaligned data big enough to avoid overstepping src */ + uint32_t v1, v2, b, s; + /* align dest to word */ + while (((uintptr_t)ds.c & lm) && n > 0) + *ds.c++ = *ss.c++, n--; + /* copy loop: load aligned words and store shifted words */ + b = (uintptr_t)ss.c & lm, s = b*8; ss.c -= b; + v1 = *ss.i++, v2 = *ss.i++; + while (n >= 3*sizeof(uint32_t)) { + *ds.i++ = (v1 _L_ s) | (v2 _U_ (32-s)); v1 = *ss.i++; + *ds.i++ = (v2 _L_ s) | (v1 _U_ (32-s)); v2 = *ss.i++; + n -= 2*sizeof(uint32_t); + } + /* data for one more store is already loaded */ + if (n >= sizeof(uint32_t)) { + *ds.i++ = (v1 _L_ s) | (v2 _U_ (32-s)); + n -= sizeof(uint32_t); + ss.c += sizeof(uint32_t); + } + ss.c += b - 2*sizeof(uint32_t); + } + /* copy 0-7 leftover bytes */ + while (n >= 4) { + *ds.c++ = *ss.c++, n--; *ds.c++ = *ss.c++, n--; + *ds.c++ = *ss.c++, n--; *ds.c++ = *ss.c++, n--; + } + while (n > 0) + *ds.c++ = *ss.c++, n--; + return dest; +} + +void *memmove (void *dest, const void *src, size_t n) +{ + struct _16 { uint32_t a[4]; }; + union { const void *v; uint8_t *c; uint32_t *i; uint64_t *l; struct _16 *s; } + ss = { src+n }, ds = { dest+n }; + size_t pd = dest > src ? dest - src : src - dest; + const int lm = sizeof(uint32_t)-1; + + if (dest <= src || dest >= src+n) + return memcpy(dest, src, n); + + /* align src to word */ + while (((uintptr_t)ss.c & lm) && n > 0) + *--ds.c = *--ss.c, n--; + /* take care not to copy multi-byte data if it overlaps */ + if (((uintptr_t)ds.c & lm) == 0) { + /* fast copy if pointers have the same aligment */ + while (n >= sizeof(struct _16) && pd >= sizeof(struct _16)) + /* copy 16 bytes blocks if no overlap */ + *--ds.s = *--ss.s, n -= sizeof(struct _16); + while (n >= sizeof(uint64_t) && pd >= sizeof(uint64_t)) + /* copy leftover 8 byte blocks if no overlap */ + *--ds.l = *--ss.l, n -= sizeof(uint64_t); + while (n >= sizeof(uint32_t) && pd >= sizeof(uint32_t)) + /* copy leftover 4 byte blocks if no overlap */ + *--ds.i = *--ss.i, n -= sizeof(uint32_t); + } else if (n >= 2*sizeof(uint32_t) && pd >= 2*sizeof(uint32_t)) { + /* unaligned data big enough to avoid understepping src */ + uint32_t v1, v2, b, s; + /* align dest to word */ + while (((uintptr_t)ds.c & lm) && n > 0) + *--ds.c = *--ss.c, n--; + /* copy loop: load aligned words and store shifted words */ + b = (uintptr_t)ss.c & lm, s = b*8; ss.c += b; + v1 = *--ss.i, v2 = *--ss.i; + while (n >= 3*sizeof(uint32_t)) { + *--ds.i = (v1 _U_ s) | (v2 _L_ (32-s)); v1 = *--ss.i; + *--ds.i = (v2 _U_ s) | (v1 _L_ (32-s)); v2 = *--ss.i; + n -= 2*sizeof(uint32_t); + } + /* data for one more store is already loaded */ + if (n >= sizeof(uint32_t)) { + *--ds.i = (v1 _U_ s) | (v2 _L_ (32-s)); + n -= sizeof(uint32_t); + ss.c -= sizeof(uint32_t); + } + ss.c -= b - 2*sizeof(uint32_t); + } + /* copy 0-7 leftover bytes (or upto everything if ptrs are too close) */ + while (n >= 4) { + *--ds.c = *--ss.c, n--; *--ds.c = *--ss.c, n--; + *--ds.c = *--ss.c, n--; *--ds.c = *--ss.c, n--; + } + while (n > 0) + *--ds.c = *--ss.c, n--; + return dest; +}
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/menu_pico.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/menu_pico.c
Changed
@@ -499,6 +499,7 @@ mee_range_h ("Overclock M68k (%)", MA_OPT2_OVERCLOCK_M68K,currentConfig.overclock_68k, 0, 1000, h_ovrclk), mee_onoff ("Emulate Z80", MA_OPT2_ENABLE_Z80, PicoIn.opt, POPT_EN_Z80), mee_onoff ("Emulate YM2612 (FM)", MA_OPT2_ENABLE_YM2612, PicoIn.opt, POPT_EN_FM), + mee_onoff ("Disable YM2612 SSG-EG", MA_OPT2_DISABLE_YM_SSG,PicoIn.opt, POPT_DIS_FM_SSGEG), mee_onoff ("Emulate SN76496 (PSG)", MA_OPT2_ENABLE_SN76496,PicoIn.opt, POPT_EN_PSG), mee_onoff ("gzip savestates", MA_OPT2_GZIP_STATES, currentConfig.EmuOpt, EOPT_GZIP_SAVES), mee_onoff ("Don't save last used ROM", MA_OPT2_NO_LAST_ROM, currentConfig.EmuOpt, EOPT_NO_AUTOSVCFG), @@ -506,6 +507,8 @@ mee_onoff ("Disable frame limiter", MA_OPT2_NO_FRAME_LIMIT,currentConfig.EmuOpt, EOPT_NO_FRMLIMIT), mee_onoff ("Enable dynarecs", MA_OPT2_DYNARECS, PicoIn.opt, POPT_EN_DRC), mee_onoff ("Status line in main menu", MA_OPT2_STATUS_LINE, currentConfig.EmuOpt, EOPT_SHOW_RTC), + mee_range ("Max auto frameskip", MA_OPT2_MAX_FRAMESKIP, currentConfig.max_skip, 1, 10), + mee_onoff ("PWM IRQ optimization", MA_OPT2_PWM_IRQ_OPT, PicoIn.opt, POPT_PWM_IRQ_OPT), MENU_OPTIONS_ADV mee_end, }; @@ -920,7 +923,8 @@ } static const char credits[] = - "PicoDrive v" VERSION " (c) notaz, 2006-2013\n\n\n" + "PicoDrive v" VERSION "\n" + "(c) notaz, 2006-2013; irixxxx, 2018-2020\n\n" "Credits:\n" "fDave: initial code\n" #ifdef EMU_C68K @@ -936,6 +940,7 @@ "MAME devs: SH2, YM2612 and SN76496 cores\n" "Eke, Stef: some Sega CD code\n" "Inder, ketchupgun: graphics\n" + "Irixxxx: SH2 drc improvements\n" #ifdef __GP2X__ "Squidge: mmuhack\n" "Dzz: ARM940 sample\n"
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/menu_pico.h -> libretro-picodrive-0~git20200716.tar.xz/platform/common/menu_pico.h
Changed
@@ -48,6 +48,7 @@ MA_OPT2_VSYNC, MA_OPT2_ENABLE_Z80, MA_OPT2_ENABLE_YM2612, + MA_OPT2_DISABLE_YM_SSG, MA_OPT2_ENABLE_SN76496, MA_OPT2_GZIP_STATES, MA_OPT2_NO_LAST_ROM, @@ -58,6 +59,8 @@ MA_OPT2_NO_SPRITE_LIM, MA_OPT2_NO_IDLE_LOOPS, MA_OPT2_OVERCLOCK_M68K, + MA_OPT2_MAX_FRAMESKIP, + MA_OPT2_PWM_IRQ_OPT, MA_OPT2_DONE, MA_OPT3_SCALE, /* psp (all OPT3) */ MA_OPT3_HSCALE32,
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/mp3.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/mp3.c
Changed
@@ -21,33 +21,6 @@ 0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320 }; -int mp3_find_sync_word(const unsigned char *buf, int size) -{ - const unsigned char *p, *pe; - - /* find byte-aligned syncword - need 12 (MPEG 1,2) or 11 (MPEG 2.5) matching bits */ - for (p = buf, pe = buf + size - 3; p <= pe; p++) - { - int pn; - if (p[0] != 0xff) - continue; - pn = p[1]; - if ((pn & 0xf8) != 0xf8 || // currently must be MPEG1 - (pn & 6) == 0) { // invalid layer - p++; continue; - } - pn = p[2]; - if ((pn & 0xf0) < 0x20 || (pn & 0xf0) == 0xf0 || // bitrates - (pn & 0x0c) != 0) { // not 44kHz - continue; - } - - return p - buf; - } - - return -1; -} - static int try_get_bitrate(unsigned char *buf, int buf_size) { int offs1, offs = 0;
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/mp3.h -> libretro-picodrive-0~git20200716.tar.xz/platform/common/mp3.h
Changed
@@ -12,8 +12,8 @@ extern unsigned short mpeg1_l3_bitrates[16]; #ifdef __GP2X__ -void mp3_update_local(int *buffer, int length, int stereo); -void mp3_start_play_local(void *f, int pos); +int _mp3dec_start(FILE *f, int fpos_start); +int _mp3dec_decode(FILE *f, int *file_pos, int file_len); #endif #endif // __COMMON_MP3_H__
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/mp3_helix.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/mp3_helix.c
Changed
@@ -9,6 +9,7 @@ #include <stdio.h> #include <string.h> +#include <dlfcn.h> #include <pico/pico_int.h> #include <pico/sound/mix.h> @@ -20,10 +21,15 @@ static unsigned char mp3_input_buffer[2 * 1024]; #ifdef __GP2X__ -#define mp3_update mp3_update_local -#define mp3_start_play mp3_start_play_local +#define mp3dec_decode _mp3dec_decode +#define mp3dec_start _mp3dec_start #endif +static void *libhelix; +HMP3Decoder (*p_MP3InitDecoder)(void); +void (*p_MP3FreeDecoder)(HMP3Decoder); +int (*p_MP3Decode)(HMP3Decoder, unsigned char **, int *, short *, int); + int mp3dec_decode(FILE *f, int *file_pos, int file_len) { unsigned char *readPtr; @@ -51,7 +57,7 @@ bytesLeft -= offset; had_err = err; - err = MP3Decode(mp3dec, &readPtr, &bytesLeft, cdda_out_buffer, 0); + err = p_MP3Decode(mp3dec, &readPtr, &bytesLeft, cdda_out_buffer, 0); if (err) { if (err == ERR_MP3_MAINDATA_UNDERFLOW && !had_err) { // just need another frame @@ -86,10 +92,31 @@ int mp3dec_start(FILE *f, int fpos_start) { + if (libhelix == NULL) { + libhelix = dlopen("./libhelix.so", RTLD_NOW); + if (libhelix == NULL) { + lprintf("mp3dec: load libhelix.so: %s\n", dlerror()); + return -1; + } + + p_MP3InitDecoder = dlsym(libhelix, "MP3InitDecoder"); + p_MP3FreeDecoder = dlsym(libhelix, "MP3FreeDecoder"); + p_MP3Decode = dlsym(libhelix, "MP3Decode"); + + if (p_MP3InitDecoder == NULL || p_MP3FreeDecoder == NULL + || p_MP3Decode == NULL) + { + lprintf("mp3dec: missing symbol(s) in libhelix.so\n"); + dlclose(libhelix); + libhelix = NULL; + return -1; + } + } + // must re-init decoder for new track if (mp3dec) - MP3FreeDecoder(mp3dec); - mp3dec = MP3InitDecoder(); + p_MP3FreeDecoder(mp3dec); + mp3dec = p_MP3InitDecoder(); return (mp3dec == 0) ? -1 : 0; }
View file
libretro-picodrive-0~git20200716.tar.xz/platform/common/mp3_sync.c
Added
@@ -0,0 +1,27 @@ + +int mp3_find_sync_word(const unsigned char *buf, int size) +{ + const unsigned char *p, *pe; + + /* find byte-aligned syncword - need 12 (MPEG 1,2) or 11 (MPEG 2.5) matching bits */ + for (p = buf, pe = buf + size - 3; p <= pe; p++) + { + int pn; + if (p[0] != 0xff) + continue; + pn = p[1]; + if ((pn & 0xf8) != 0xf8 || // currently must be MPEG1 + (pn & 6) == 0) { // invalid layer + p++; continue; + } + pn = p[2]; + if ((pn & 0xf0) < 0x20 || (pn & 0xf0) == 0xf0 || // bitrates + (pn & 0x0c) != 0) { // not 44kHz + continue; + } + + return p - buf; + } + + return -1; +}
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/plat_sdl.c -> libretro-picodrive-0~git20200716.tar.xz/platform/common/plat_sdl.c
Changed
@@ -89,6 +89,8 @@ /* YUV stuff */ static int yuv_ry[32], yuv_gy[32], yuv_by[32]; static unsigned char yuv_u[32 * 2], yuv_v[32 * 2]; +static unsigned char yuv_y[256]; +static struct uyvy { unsigned int y:8; unsigned int vyu:24; } yuv_uyvy[65536]; void bgr_to_uyvy_init(void) { @@ -119,34 +121,40 @@ v = 255; yuv_v[i + 32] = v; } + // valid Y range seems to be 16..235 + for (i = 0; i < 256; i++) { + yuv_y[i] = 16 + 219 * i / 32; + } + // everything combined into one large array for speed + for (i = 0; i < 65536; i++) { + int r = (i >> 11) & 0x1f, g = (i >> 6) & 0x1f, b = (i >> 0) & 0x1f; + int y = (yuv_ry[r] + yuv_gy[g] + yuv_by[b]) >> 16; + yuv_uyvy[i].y = yuv_y[y]; + yuv_uyvy[i].vyu = (yuv_v[r-y + 32] << 16) | (yuv_y[y] << 8) | yuv_u[b-y + 32]; + } } void rgb565_to_uyvy(void *d, const void *s, int pixels) { - unsigned int *dst = d; - const unsigned short *src = s; - const unsigned char *yu = yuv_u + 32; - const unsigned char *yv = yuv_v + 32; - int r0, g0, b0, r1, g1, b1; - int y0, y1, u, v; - - for (; pixels > 0; src += 2, dst++, pixels -= 2) + uint32_t *dst = d; + const uint16_t *src = s; + + if (plat_sdl_overlay->w > 2*plat_sdl_overlay->h) + for (; pixels > 0; src += 4, dst += 4, pixels -= 4) + { + struct uyvy *uyvy0 = yuv_uyvy + src[0], *uyvy1 = yuv_uyvy + src[1]; + struct uyvy *uyvy2 = yuv_uyvy + src[2], *uyvy3 = yuv_uyvy + src[3]; + dst[0] = (uyvy0->y << 24) | uyvy0->vyu; + dst[1] = (uyvy1->y << 24) | uyvy1->vyu; + dst[2] = (uyvy2->y << 24) | uyvy2->vyu; + dst[3] = (uyvy3->y << 24) | uyvy3->vyu; + } else + for (; pixels > 0; src += 4, dst += 2, pixels -= 4) { - r0 = (src[0] >> 11) & 0x1f; - g0 = (src[0] >> 6) & 0x1f; - b0 = src[0] & 0x1f; - r1 = (src[1] >> 11) & 0x1f; - g1 = (src[1] >> 6) & 0x1f; - b1 = src[1] & 0x1f; - y0 = (yuv_ry[r0] + yuv_gy[g0] + yuv_by[b0]) >> 16; - y1 = (yuv_ry[r1] + yuv_gy[g1] + yuv_by[b1]) >> 16; - u = yu[b0 - y0]; - v = yv[r0 - y0]; - // valid Y range seems to be 16..235 - y0 = 16 + 219 * y0 / 31; - y1 = 16 + 219 * y1 / 31; - - *dst = (y1 << 24) | (v << 16) | (y0 << 8) | u; + struct uyvy *uyvy0 = yuv_uyvy + src[0], *uyvy1 = yuv_uyvy + src[1]; + struct uyvy *uyvy2 = yuv_uyvy + src[2], *uyvy3 = yuv_uyvy + src[3]; + dst[0] = (uyvy1->y << 24) | uyvy0->vyu; + dst[1] = (uyvy3->y << 24) | uyvy2->vyu; } } @@ -272,7 +280,7 @@ if (shadow_size < 320 * 480 * 2) shadow_size = 320 * 480 * 2; - shadow_fb = malloc(shadow_size); + shadow_fb = calloc(1, shadow_size); g_menubg_ptr = calloc(1, shadow_size); if (shadow_fb == NULL || g_menubg_ptr == NULL) { fprintf(stderr, "OOM\n");
View file
libretro-picodrive-0~git20200112.tar.xz/platform/common/version.h -> libretro-picodrive-0~git20200716.tar.xz/platform/common/version.h
Changed
@@ -1,1 +1,1 @@ -#define VERSION "1.92" +#define VERSION "1.96"
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gizmondo/Makefile -> libretro-picodrive-0~git20200716.tar.xz/platform/gizmondo/Makefile
Changed
@@ -98,6 +98,9 @@ endif +../../tools/textfilter: ../../tools/textfilter.c + make -C ../../tools/ textfilter + readme.txt: ../../tools/textfilter ../base_readme.txt ../../tools/textfilter ../base_readme.txt $@ GIZ
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gizmondo/emu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gizmondo/emu.c
Changed
@@ -155,7 +155,7 @@ } // a hack for VR if (PicoIn.AHW & PAHW_SVP) - memset32((int *)(Pico.est.Draw2FB+328*8+328*223), 0xe0e0e0e0, 328); + memset((int *)(Pico.est.Draw2FB+328*8+328*223), 0xe0e0e0e0, 328*4); if (!(Pico.video.reg[12]&1)) lines_flags|=0x10000; if (currentConfig.EmuOpt&0x4000) lines_flags|=0x40000; // (Pico.m.frame_count&1)?0x20000:0x40000; @@ -166,22 +166,25 @@ int lines_flags; // 8bit accurate renderer if (Pico.m.dirtyPal) { - Pico.m.dirtyPal = 0; - vidConvCpyRGB565(localPal, Pico.cram, 0x40); + if (Pico.m.dirtyPal == 2) + Pico.m.dirtyPal = 0; + /* no support + switch (Pico.est.SonicPalCount) { + case 3: vidConvCpyRGB565(localPal+0xc0, Pico.est.SonicPal+0xc0, 0x40); + case 2: vidConvCpyRGB565(localPal+0x80, Pico.est.SonicPal+0x80, 0x40); + case 1: vidConvCpyRGB565(localPal+0x40, Pico.est.SonicPal+0x40, 0x40); + default://vidConvCpyRGB565(localPal, Pico.est.SonicPal, 0x40); + } */ + vidConvCpyRGB565(localPal, Pico.est.SonicPal, 0x40); if (Pico.video.reg[0xC]&8) { // shadow/hilight mode - //vidConvCpyRGB32sh(localPal+0x40, Pico.cram, 0x40); - //vidConvCpyRGB32hi(localPal+0x80, Pico.cram, 0x40); // TODO? - memcpy32((void *)(localPal+0xc0), (void *)(localPal+0x40), 0x40*2/4); + //vidConvCpyRGB32sh(localPal+0x40, Pico.est.SonicPal, 0x40); + //vidConvCpyRGB32hi(localPal+0x80, Pico.est.SonicPal, 0x40); // TODO? + memcpy((void *)(localPal+0xc0), (void *)(localPal+0x40), 0x40*2); localPal[0xc0] = 0x0600; localPal[0xd0] = 0xc000; localPal[0xe0] = 0x0000; // reserved pixels for OSD localPal[0xf0] = 0xffff; } - /* no support - else if (rendstatus & 0x20) { // mid-frame palette changes - vidConvCpyRGB565(localPal+0x40, HighPal, 0x40); - vidConvCpyRGB565(localPal+0x80, HighPal+0x40, 0x40); - } */ } lines_flags = (Pico.video.reg[1]&8) ? 240 : 224; if (!(Pico.video.reg[12]&1)) lines_flags|=0x10000;
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gizmondo/menu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gizmondo/menu.c
Changed
@@ -54,7 +54,7 @@ void menu_draw_begin(int use_bgbuff) { if (use_bgbuff) - memcpy32((int *)menu_screen, (int *)bg_buffer, 321*240*2/4); + memcpy((int *)menu_screen, (int *)bg_buffer, 321*240*2); } @@ -66,7 +66,7 @@ lprintf("%s: Framework2D_LockBuffer() returned NULL\n", __FUNCTION__); return; } - memcpy32(giz_screen, (int *)menu_screen, 321*240*2/4); + memcpy(giz_screen, (int *)menu_screen, 321*240*2); fb_unlock(); giz_screen = NULL; fb_flip();
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/940ctl.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/940ctl.c
Changed
@@ -100,10 +100,10 @@ UINT16 *writebuff = shared_ctl->writebuffsel ? shared_ctl->writebuff0 : shared_ctl->writebuff1; /* detect rapid ym updates */ - if (upd && !(writebuff_ptr & 0x80000000) && scanline < 224) + if (upd && !(writebuff_ptr & 0x80000000)) { - int mid = Pico.m.pal ? 68 : 93; - if (scanline > mid) { + int mid = (Pico.m.pal ? 313 : 262) / 2; + if (scanline >= mid) { //printf("%05i:%03i: rapid ym\n", Pico.m.frame_count, scanline); writebuff[writebuff_ptr++ & 0xffff] = 0xfffe; writebuff_ptr |= 0x80000000; @@ -282,7 +282,7 @@ } -void YM2612Init_940(int baseclock, int rate) +void YM2612Init_940(int baseclock, int rate, int ssg) { static int oldrate; @@ -339,7 +339,7 @@ memset(shared_ctl, 0, sizeof(*shared_ctl)); /* cause local ym2612 to init REGS */ - YM2612Init_(baseclock, rate); + YM2612Init_(baseclock, rate, ssg); internal_reset(); @@ -425,8 +425,7 @@ int mp3dec_decode(FILE *f, int *file_pos, int file_len) { if (!(PicoIn.opt & POPT_EXT_FM)) { - //mp3_update_local(buffer, length, stereo); - return 0; + return _mp3dec_decode(f, file_pos, file_len); } // check if playback was started, track not ended @@ -457,8 +456,7 @@ int mp3dec_start(FILE *f, int fpos_start) { if (!(PicoIn.opt & POPT_EXT_FM)) { - //mp3_start_play_local(f, pos); - return -1; + return _mp3dec_start(f, fpos_start); } if (loaded_mp3 != f)
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/940ctl.h -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/940ctl.h
Changed
@@ -1,7 +1,7 @@ void sharedmem940_init(void); void sharedmem940_finish(void); -void YM2612Init_940(int baseclock, int rate); +void YM2612Init_940(int baseclock, int rate, int ssg); void YM2612ResetChip_940(void); int YM2612UpdateOne_940(int *buffer, int length, int stereo, int is_buf_empty);
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/Makefile -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/Makefile
Changed
@@ -11,7 +11,7 @@ all: rel ../../tools/textfilter: ../../tools/textfilter.c - make -C ../../tools/ + make -C ../../tools/ textfilter readme.txt: ../../tools/textfilter ../base_readme.txt ../../ChangeLog ../../tools/textfilter ../base_readme.txt $@ GP2X
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/PicoDrive.gpe -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/PicoDrive.gpe
Changed
@@ -7,6 +7,8 @@ export POLLUX_RAM_TIMINGS='ram_timings=2,9,4,1,1,1,1' export POLLUX_LCD_TIMINGS_NTSC='lcd_timings=397,1,37,277,341,0,17,337;clkdiv0=9' export POLLUX_LCD_TIMINGS_PAL='lcd_timings=428,1,37,277,341,0,17,337;clkdiv0=10' +else + export POLLUX_RAM_TIMINGS='ram_timings=3,9,4,1,1,1,1' fi ./PicoDrive "$@"
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/code940/940.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/code940/940.c
Changed
@@ -2,7 +2,7 @@ // (c) Copyright 2006-2007, Grazvydas "notaz" Ignotas #include "940shared.h" -#include "../../common/mp3.h" +#include "../../common/helix/pub/mp3dec.h" static _940_data_t *shared_data = (_940_data_t *) 0x00100000; static _940_ctl_t *shared_ctl = (_940_ctl_t *) 0x00200000; @@ -19,7 +19,7 @@ // is changed by other core just before we update it void set_if_not_changed(int *val, int oldval, int newval); -void _memcpy(void *dst, const void *src, int count); +extern void *memcpy(void *dest, const void *src, unsigned long n); // asm volatile ("mov r0, #0" ::: "r0"); // asm volatile ("mcr p15, 0, r0, c7, c6, 0" ::: "r0"); /* flush dcache */ @@ -153,6 +153,8 @@ int job = 0; ym2612_940 = &shared_data->ym2612; +// extern unsigned __bss_start__, __bss_end__; +// memset(&__bss_start__, 0, &__bss_end__ - &__bss_start__); for (;;) { @@ -165,8 +167,9 @@ case JOB940_INITALL: /* ym2612 */ shared_ctl->writebuff0[0] = shared_ctl->writebuff1[0] = 0xffff; - YM2612Init_(shared_ctl->baseclock, shared_ctl->rate); + YM2612Init_(shared_ctl->baseclock, shared_ctl->rate, 0); /* Helix mp3 decoder */ + __malloc_init(); shared_data->mp3dec = MP3InitDecoder(); break; @@ -185,7 +188,7 @@ case JOB940_PICOSTATESAVE2: YM2612PicoStateSave2(0, 0); - _memcpy(shared_ctl->writebuff0, ym2612_940->REGS, 0x200); + memcpy(shared_ctl->writebuff0, ym2612_940->REGS, 0x200); break; case JOB940_PICOSTATELOAD2_PREP: @@ -193,7 +196,7 @@ break; case JOB940_PICOSTATELOAD2: - _memcpy(ym2612_940->REGS, shared_ctl->writebuff0, 0x200); + memcpy(ym2612_940->REGS, shared_ctl->writebuff0, 0x200); YM2612PicoStateLoad2(0, 0); break; @@ -207,6 +210,7 @@ case JOB940_MP3RESET: if (shared_data->mp3dec) MP3FreeDecoder(shared_data->mp3dec); + __malloc_init(); shared_data->mp3dec = MP3InitDecoder(); break; } @@ -215,4 +219,3 @@ dcache_clean(); } } -
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/code940/Makefile -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/code940/Makefile
Changed
@@ -1,17 +1,23 @@ # you may or may not need to change this -#devkit_path = x:/stuff/dev/devkitgp2x/ -devkit_path ?= $(HOME)/opt/devkitGP2X/ -lgcc_path = $(devkit_path)lib/gcc/arm-linux/4.0.3/ -CROSS = arm-linux- +#devkit_path ?= $(HOME)/opt/devkitGP2X/ +#lgcc_path = $(devkit_path)lib/gcc/arm-linux/4.0.3/ #CROSS = $(devkit_path)bin/arm-linux- +#devkit_path ?= $(HOME)/opt/open2x +#lgcc_path = $(devkit_path)/gcc-4.1.1-glibc-2.3.6/lib/gcc/arm-open2x-linux/4.1.1/ +#CROSS ?= $(devkit_path)/gcc-4.1.1-glibc-2.3.6/bin/arm-open2x-linux- +#devkit_path ?= $(HOME)/opt/arm-unknown-linux-gnu +#lgcc_path = $(HOME)/opt/open2x/gcc-4.1.1-glibc-2.3.6/lib/gcc/arm-open2x-linux/4.1.1/ +#CROSS ?= $(devkit_path)/bin/arm-unknown-linux-gnu- +lgcc_path = $(HOME)/opt/open2x/gcc-4.1.1-glibc-2.3.6/lib/gcc/arm-open2x-linux/4.1.1/ +CROSS ?= arm-linux-gnueabi- # settings #up = 1 -CFLAGS += -O2 -Wall -fomit-frame-pointer -fstrict-aliasing -ffast-math -CFLAGS += -I../.. -I. -D__GP2X__ -DARM -CFLAGS += -mcpu=arm940t -mtune=arm940t -LDFLAGS = -static -s -e code940 -Ttext 0x0 -L$(lgcc_path) -lgcc +CFLAGS += -O2 -Wall -mno-thumb-interwork -fstrict-aliasing -ffast-math +CFLAGS += -I../../common/helix/pub -I../../.. -I. -D__GP2X__ -DARM +CFLAGS += -mcpu=arm940t -mtune=arm940t -mabi=apcs-gnu -mfloat-abi=soft -mfpu=fpa +LDFLAGS = -static -e code940 -Ttext 0x0 -L$(lgcc_path) -lgcc GCC = $(CROSS)gcc STRIP = $(CROSS)strip @@ -36,7 +42,9 @@ # stuff for 940 core # init, emu_control, emu -OBJS940 += 940init.o 940.o 940ym2612.o memcpy.o misc_arm.o mp3.o +OBJS940 += 940init.o 940.o 940ym2612.o misc_arm.o mp3_sync.o +# the asm memcpy code crashes job LOAD2 on 940. Possibly a globbered reg? +# OBJS940 += memcpy.o # the asm code seems to be faster when run on 920, but not on 940 for some reason # OBJS940 += ../../Pico/sound/ym2612_asm.o @@ -44,12 +52,13 @@ OBJS940 += uClibc/memset.o uClibc/s_floor.o uClibc/e_pow.o uClibc/e_sqrt.o uClibc/s_fabs.o OBJS940 += uClibc/s_scalbn.o uClibc/s_copysign.o uClibc/k_sin.o uClibc/k_cos.o uClibc/s_sin.o OBJS940 += uClibc/e_rem_pio2.o uClibc/k_rem_pio2.o uClibc/e_log.o uClibc/wrappers.o +LIBHELIX ?= ../../common/helix/$(notdir $(CROSS))helix_mp3.a $(BIN) : code940.elf @echo ">>>" $@ $(OBJCOPY) -O binary $< $@ -code940.elf : $(OBJS940) ../../common/helix/$(CROSS)helix-mp3.a +code940.elf : $(OBJS940) $(LIBHELIX) @echo ">>>" $@ $(LD) $^ $(LDFLAGS) -o $@ -Map code940.map @@ -64,8 +73,12 @@ @echo ">>>" $@ $(GCC) $(CFLAGS) -DEXTERNAL_YM2612 -c $< -o $@ -../../common/helix/helix_mp3.a: - @make -C ../../common/helix/ +mp3_sync.o: ../../common/mp3_sync.c + @echo ">>>" $@ + $(GCC) $(CFLAGS) -Os -DCODE940 -c $< -o $@ + +$(LIBHELIX): + @$(MAKE) -C ../../common/helix/ CROSS=$(CROSS) up: $(BIN) @@ -82,7 +95,7 @@ ## OBJSMP3T = mp3test.o ../gp2x.o ../asmutils.o ../usbjoy.o -mp3test.gpe : $(OBJSMP3T) ../helix/helix_mp3.a +mp3test.gpe : $(OBJSMP3T) $(LIBHELIX) $(GCC) -static -o $@ $^ $(STRIP) $@ @cp -v $@ /mnt/gp2x/mnt/sd
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/code940/memcpy.s -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/code940/memcpy.s
Changed
@@ -114,14 +114,12 @@ blt Lmemcpy_fl32 /* less than 32 bytes (12 from above) */ stmdb sp!, {r4, r7, r8, r9, r10} /* borrow r4 */ -/* blat 64 bytes at a time */ +/* blat 32 bytes at a time */ /* XXX for really big copies perhaps we should use more registers */ Lmemcpy_floop32: ldmia r1!, {r3, r4, r7, r8, r9, r10, r12, lr} stmia r0!, {r3, r4, r7, r8, r9, r10, r12, lr} -ldmia r1!, {r3, r4, r7, r8, r9, r10, r12, lr} -stmia r0!, {r3, r4, r7, r8, r9, r10, r12, lr} -subs r2, r2, #0x40 +subs r2, r2, #0x20 bge Lmemcpy_floop32 cmn r2, #0x10 @@ -314,14 +312,12 @@ subs r2, r2, #0x14 /* less than 32 bytes (12 from above) */ blt Lmemcpy_bl32 -/* blat 64 bytes at a time */ +/* blat 32 bytes at a time */ /* XXX for really big copies perhaps we should use more registers */ Lmemcpy_bloop32: ldmdb r1!, {r3, r4, r7, r8, r9, r10, r12, lr} stmdb r0!, {r3, r4, r7, r8, r9, r10, r12, lr} -ldmdb r1!, {r3, r4, r7, r8, r9, r10, r12, lr} -stmdb r0!, {r3, r4, r7, r8, r9, r10, r12, lr} -subs r2, r2, #0x40 +subs r2, r2, #0x20 bge Lmemcpy_bloop32 Lmemcpy_bl32:
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/code940/mp3test.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/code940/mp3test.c
Changed
@@ -13,7 +13,7 @@ //#include "emu.h" //#include "menu.h" #include "../asmutils.h" -#include "../helix/pub/mp3dec.h" +#include "../../helix/pub/mp3dec.h" /* we will need some gp2x internals here */ extern volatile unsigned short *gp2x_memregs; /* from minimal library rlyeh */
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/code940/uClibc/memset.s -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/code940/uClibc/memset.s
Changed
@@ -22,7 +22,7 @@ .text .global memset .type memset,%function - .align 4 + .align 2 memset: mov a4, a1
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/code940/uClibc/wrappers.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/code940/uClibc/wrappers.c
Changed
@@ -4,9 +4,17 @@ { return __ieee754_pow(x, y); } +double __pow_finite(double x, double y) +{ + return __ieee754_pow(x, y); +} double log(double x) { return __ieee754_log(x); } +double __log_finite(double x) +{ + return __ieee754_log(x); +}
View file
libretro-picodrive-0~git20200112.tar.xz/platform/gp2x/emu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/gp2x/emu.c
Changed
@@ -55,7 +55,7 @@ gp2x_soc_t soc; defaultConfig.CPUclock = default_cpu_clock; - defaultConfig.renderer32x = RT_8BIT_FAST; + defaultConfig.renderer32x = RT_8BIT_ACC; defaultConfig.analog_deadzone = 50; soc = soc_detect(); @@ -291,38 +291,51 @@ } static int localPal[0x100]; +static int localPalSize; + static void (*vidcpyM2)(void *dest, void *src, int m32col, int with_32c_border); static int (*make_local_pal)(int fast_mode); static int make_local_pal_md(int fast_mode) { - int pallen = 0xc0; - - bgr444_to_rgb32(localPal, Pico.cram); - if (fast_mode) - return 0x40; + int pallen = 0x100; - if (Pico.video.reg[0xC] & 8) { // shadow/hilight mode - bgr444_to_rgb32_sh(localPal, Pico.cram); - localPal[0xc0] = 0x0000c000; - localPal[0xd0] = 0x00c00000; - localPal[0xe0] = 0x00000000; // reserved pixels for OSD - localPal[0xf0] = 0x00ffffff; - pallen = 0x100; + if (fast_mode) { + bgr444_to_rgb32(localPal, PicoMem.cram); + pallen = 0x40; + Pico.m.dirtyPal = 0; } else if (Pico.est.rendstatus & PDRAW_SONIC_MODE) { // mid-frame palette changes - bgr444_to_rgb32(localPal+0x40, Pico.est.HighPal); - bgr444_to_rgb32(localPal+0x80, Pico.est.HighPal+0x40); + switch (Pico.est.SonicPalCount) { + case 3: bgr444_to_rgb32(localPal+0xc0, Pico.est.SonicPal+0xc0); + case 2: bgr444_to_rgb32(localPal+0x80, Pico.est.SonicPal+0x80); + case 1: bgr444_to_rgb32(localPal+0x40, Pico.est.SonicPal+0x40); + default:bgr444_to_rgb32(localPal, Pico.est.SonicPal); + } + pallen = (Pico.est.SonicPalCount+1)*0x40; } - else - memcpy(localPal + 0x80, localPal, 0x40 * 4); // for spr prio mess + else if (Pico.video.reg[0xC] & 8) { // shadow/hilight mode + bgr444_to_rgb32(localPal, Pico.est.SonicPal); + bgr444_to_rgb32_sh(localPal, Pico.est.SonicPal); + } + else { + bgr444_to_rgb32(localPal, Pico.est.SonicPal); + memcpy(localPal+0x40, localPal, 0x40*4); // for spr prio mess + memcpy(localPal+0x80, localPal, 0x80*4); // for spr prio mess + } + localPal[0xc0] = 0x0000c000; + localPal[0xd0] = 0x00c00000; + localPal[0xe0] = 0x00000000; // reserved pixels for OSD + localPal[0xf0] = 0x00ffffff; + if (Pico.m.dirtyPal == 2) + Pico.m.dirtyPal = 0; return pallen; } static int make_local_pal_sms(int fast_mode) { - unsigned short *spal = Pico.cram; + unsigned short *spal = PicoMem.cram; unsigned int *dpal = (void *)localPal; unsigned int i, t; @@ -334,25 +347,21 @@ *dpal++ = t; } + Pico.m.dirtyPal = 0; return 0x40; } void pemu_finalize_frame(const char *fps, const char *notice) { int emu_opt = currentConfig.EmuOpt; - int ret; if (PicoIn.AHW & PAHW_32X) - ; // nothing to do + localPalSize = 0; // nothing to do else if (get_renderer() == RT_8BIT_FAST) { // 8bit fast renderer - if (Pico.m.dirtyPal) { - Pico.m.dirtyPal = 0; - ret = make_local_pal(1); - // feed new palette to our device - gp2x_video_setpalette(localPal, ret); - } + if (Pico.m.dirtyPal) + localPalSize = make_local_pal(1); // a hack for VR if (PicoIn.AHW & PAHW_SVP) memset32((int *)(Pico.est.Draw2FB+328*8+328*223), 0xe0e0e0e0, 328); @@ -364,12 +373,9 @@ { // 8bit accurate renderer if (Pico.m.dirtyPal) - { - Pico.m.dirtyPal = 0; - ret = make_local_pal(0); - gp2x_video_setpalette(localPal, ret); - } + localPalSize = make_local_pal(0); } + else localPalSize = 0; // no palette in 16bit mode if (notice) osd_text(4, osd_y, notice); @@ -385,6 +391,10 @@ { int stride = g_screen_width; gp2x_video_flip(); + // switching the palette takes immediate effect, whilst flipping only + // takes effect with the next vsync; unavoidable flicker may occur! + if (localPalSize) + gp2x_video_setpalette(localPal, localPalSize); if (is_16bit_mode()) stride *= 2; @@ -502,9 +512,6 @@ if (renderer == RT_16BIT && (currentConfig.EmuOpt & EOPT_WIZ_TEAR_FIX)) { PicoDrawSetOutFormat(PDF_RGB555, 1); } - else { - PicoDrawSetOutFormat(PDF_NONE, 0); - } PicoDrawSetOutBuf(g_screen_ptr, g_screen_width * 2); gp2x_mode = 16; } @@ -537,10 +544,7 @@ localPal[0xe0] = 0x00000000; // reserved pixels for OSD localPal[0xf0] = 0x00ffffff; gp2x_video_setpalette(localPal, 0x100); - gp2x_memset_all_buffers(0, 0xe0, 320*240); } - else - gp2x_memset_all_buffers(0, 0, 320*240*2); if (currentConfig.EmuOpt & EOPT_WIZ_TEAR_FIX) gp2x_mode = -gp2x_mode; @@ -723,6 +727,8 @@ PicoDrawSetCallbacks(NULL, NULL); Pico.m.dirtyPal = 1; + if (!no_scale) + no_scale = currentConfig.scaling == EOPT_SCALE_NONE; emu_cmn_forced_frame(no_scale, do_emu); g_menubg_src_ptr = g_screen_ptr;
View file
libretro-picodrive-0~git20200112.tar.xz/platform/libpicofe/fonts.h -> libretro-picodrive-0~git20200716.tar.xz/platform/libpicofe/fonts.h
Changed
@@ -1,6 +1,6 @@ extern unsigned char fontdata8x8[64*16]; -extern unsigned char fontdata6x8[256-32][8]; +extern unsigned char fontdata6x8[256][8]; void basic_text_out16_nf(void *fb, int w, int x, int y, const char *text); void basic_text_out16(void *fb, int w, int x, int y, const char *texto, ...);
View file
libretro-picodrive-0~git20200112.tar.xz/platform/libpicofe/linux/host_dasm.c -> libretro-picodrive-0~git20200716.tar.xz/platform/libpicofe/linux/host_dasm.c
Changed
@@ -22,11 +22,31 @@ static struct disassemble_info di; -#ifdef __arm__ +#if defined __arm__ #define print_insn_func print_insn_little_arm #define BFD_ARCH bfd_arch_arm #define BFD_MACH bfd_mach_arm_unknown #define DASM_OPTS "reg-names-std" +#elif defined __aarch64__ +#define print_insn_func print_insn_aarch64 +#define BFD_ARCH bfd_arch_aarch64 +#define BFD_MACH bfd_mach_aarch64 +#define DASM_OPTS NULL +#elif defined __mips__ +#define print_insn_func print_insn_little_mips +#define BFD_ARCH bfd_arch_mips +#define BFD_MACH bfd_mach_mipsisa32 +#define DASM_OPTS NULL +#elif defined __riscv +#define print_insn_func print_insn_riscv +#define BFD_ARCH bfd_arch_riscv +#define BFD_MACH bfd_mach_riscv64 +#define DASM_OPTS NULL +#elif defined __powerpc__ +#define print_insn_func print_insn_little_powerpc +#define BFD_ARCH bfd_arch_powerpc +#define BFD_MACH bfd_mach_ppc64 +#define DASM_OPTS NULL #elif defined(__x86_64__) || defined(__i386__) #define print_insn_func print_insn_i386_intel #define BFD_ARCH bfd_arch_i386
View file
libretro-picodrive-0~git20200112.tar.xz/platform/libpicofe/plat_sdl.c -> libretro-picodrive-0~git20200716.tar.xz/platform/libpicofe/plat_sdl.c
Changed
@@ -31,7 +31,7 @@ static int window_w, window_h; static int fs_w, fs_h; static int old_fullscreen; -static int vout_mode_overlay = -1, vout_mode_gl = -1; +static int vout_mode_overlay = -1, vout_mode_overlay2x = -1, vout_mode_gl = -1; static void *display, *window; static int gl_quirks; @@ -52,6 +52,7 @@ // invalid method might come from config.. if (plat_target.vout_method != 0 && plat_target.vout_method != vout_mode_overlay + && plat_target.vout_method != vout_mode_overlay2x && plat_target.vout_method != vout_mode_gl) { fprintf(stderr, "invalid vout_method: %d\n", plat_target.vout_method); @@ -96,8 +97,10 @@ } } - if (plat_target.vout_method == vout_mode_overlay) { - plat_sdl_overlay = SDL_CreateYUVOverlay(w, h, SDL_UYVY_OVERLAY, plat_sdl_screen); + if (plat_target.vout_method == vout_mode_overlay + || plat_target.vout_method == vout_mode_overlay2x) { + int W = plat_target.vout_method == vout_mode_overlay2x && w == 320 ? 2*w : w; + plat_sdl_overlay = SDL_CreateYUVOverlay(W, h, SDL_UYVY_OVERLAY, plat_sdl_screen); if (plat_sdl_overlay != NULL) { if ((long)plat_sdl_overlay->pixels[0] & 3) fprintf(stderr, "warning: overlay pointer is unaligned\n"); @@ -176,7 +179,7 @@ int plat_sdl_init(void) { - static const char *vout_list[] = { NULL, NULL, NULL, NULL }; + static const char *vout_list[] = { NULL, NULL, NULL, NULL, NULL }; const SDL_VideoInfo *info; SDL_SysWMinfo wminfo; int overlay_works = 0; @@ -277,6 +280,10 @@ if (overlay_works) { plat_target.vout_method = vout_mode_overlay = i; vout_list[i++] = "Video Overlay"; +#ifdef SDL_OVERLAY_2X + vout_mode_overlay2x = i; + vout_list[i++] = "Video Overlay 2x"; +#endif } if (gl_works) { plat_target.vout_method = vout_mode_gl = i;
View file
libretro-picodrive-0~git20200112.tar.xz/platform/libretro/libretro-common/include/libretro_gskit_ps2.h -> libretro-picodrive-0~git20200716.tar.xz/platform/libretro/libretro-common/include/libretro_gskit_ps2.h
Changed
@@ -1,4 +1,4 @@ -/* Copyright (C) 2010-2018 The RetroArch team +/* Copyright (C) 2010-2020 The RetroArch team * * --------------------------------------------------------------------------------------------- * The following license statement only applies to this libretro API header (libretro_d3d.h) @@ -33,7 +33,7 @@ #include <gsKit.h> -#define RETRO_HW_RENDER_INTERFACE_GSKIT_PS2_VERSION 1 +#define RETRO_HW_RENDER_INTERFACE_GSKIT_PS2_VERSION 2 struct retro_hw_ps2_insets { @@ -57,8 +57,6 @@ * in this interface. */ GSTEXTURE *coreTexture; - bool clearTexture; - bool updatedPalette; struct retro_hw_ps2_insets padding; }; typedef struct retro_hw_render_interface_gskit_ps2 RETRO_HW_RENDER_INTEFACE_GSKIT_PS2;
View file
libretro-picodrive-0~git20200112.tar.xz/platform/libretro/libretro.c -> libretro-picodrive-0~git20200716.tar.xz/platform/libretro/libretro.c
Changed
@@ -88,7 +88,7 @@ #define VOUT_32BIT_WIDTH 256 #endif #define VOUT_MAX_HEIGHT 240 -#define SND_RATE 44100 +#define INITIAL_SND_RATE 44100 static const float VOUT_PAR = 0.0; static const float VOUT_4_3 = (224.0f * (4.0f / 3.0f)); @@ -112,10 +112,12 @@ static struct retro_hw_ps2_insets padding; #endif -static short ALIGNED(4) sndBuffer[2*SND_RATE/50]; +static short ALIGNED(4) sndBuffer[2*INITIAL_SND_RATE/50]; static void snd_write(int len); +char **g_argv; + #ifdef _WIN32 #define SLASH '\\' #else @@ -533,10 +535,6 @@ memset(vout_buf, 0, vout_width * VOUT_MAX_HEIGHT); memset(retro_palette, 0, gsKit_texture_size_ee(16, 16, GS_PSM_CT16)); PicoDrawSetOutBuf(vout_buf, vout_width); - - if (ps2) { - ps2->clearTexture = true; - } #else vout_width = is_32cols ? VOUT_32BIT_WIDTH : VOUT_MAX_WIDTH; memset(vout_buf, 0, VOUT_MAX_WIDTH * VOUT_MAX_HEIGHT * 2); @@ -569,6 +567,8 @@ void emu_32x_startup(void) { + PicoDrawSetOutFormat(PDF_RGB555, 0); + PicoDrawSetOutBuf(vout_buf, vout_width * 2); } void lprintf(const char *fmt, ...) @@ -635,7 +635,7 @@ memset(info, 0, sizeof(*info)); info->timing.fps = Pico.m.pal ? 50 : 60; - info->timing.sample_rate = SND_RATE; + info->timing.sample_rate = PicoIn.sndRate; info->geometry.base_width = vout_width; info->geometry.base_height = vout_height; info->geometry.max_width = vout_width; @@ -1301,6 +1301,7 @@ struct retro_variable var; int OldPicoRegionOverride; float old_user_vout_width; + double new_sound_rate; var.value = NULL; var.key = "picodrive_input1"; @@ -1431,6 +1432,29 @@ if (environ_cb(RETRO_ENVIRONMENT_GET_VARIABLE, &var) && var.value) { PicoIn.sndFilterRange = (atoi(var.value) * 65536) / 100; } + + var.value = NULL; + var.key = "picodrive_renderer"; + if (environ_cb(RETRO_ENVIRONMENT_GET_VARIABLE, &var) && var.value) { + if (strcmp(var.value, "fast") == 0) + PicoIn.opt |= POPT_ALT_RENDERER; + else + PicoIn.opt &= ~POPT_ALT_RENDERER; + } + + var.value = NULL; + var.key = "picodrive_sound_rate"; + if (environ_cb(RETRO_ENVIRONMENT_GET_VARIABLE, &var) && var.value) { + new_sound_rate = atoi(var.value); + if (new_sound_rate != PicoIn.sndRate) { + /* Update the sound rate */ + PicoIn.sndRate = new_sound_rate; + PsndRerate(1); + struct retro_system_av_info av_info; + retro_get_system_av_info(&av_info); + environ_cb(RETRO_ENVIRONMENT_SET_SYSTEM_AV_INFO, &av_info); + } + } } void retro_run(void) @@ -1524,7 +1548,6 @@ } Pico.m.dirtyPal = 0; - ps2->updatedPalette = true; } if (PicoIn.AHW & PAHW_SMS) { @@ -1536,6 +1559,25 @@ ps2->padding = padding; #else + if (PicoIn.opt & POPT_ALT_RENDERER) { + /* In retro_init, PicoDrawSetOutBuf is called to make sure the output gets written to vout_buf, but this only + * applies to the line renderer (pico/draw.c). The faster tile-based renderer (pico/draw2.c) enabled by + * POPT_ALT_RENDERER writes to Pico.est.Draw2FB, so we need to manually copy that to vout_buf. + */ + /* This section is mostly copied from pemu_finalize_frame in platform/linux/emu.c */ + unsigned short *pd = (unsigned short *)vout_buf; + /* Skip the leftmost 8 columns (it seems to be used as some sort of caching or overscan area) */ + unsigned char *ps = Pico.est.Draw2FB + 8; + unsigned short *pal = Pico.est.HighPal; + int x; + if (Pico.m.dirtyPal) + PicoDrawUpdateHighPal(); + /* Copy up to the max height to include the overscan area, and skip the leftmost 8 columns again */ + for (i = 0; i < VOUT_MAX_HEIGHT; i++, ps += 8) + for (x = 0; x < vout_width; x++) + *pd++ = pal[*ps++]; + } + buff = (char*)vout_buf + vout_offset; #endif @@ -1574,7 +1616,13 @@ #endif PicoIn.opt |= POPT_EN_DRC; #endif - PicoIn.sndRate = SND_RATE; + + struct retro_variable var = { .key = "picodrive_sound_rate" }; + if (environ_cb(RETRO_ENVIRONMENT_GET_VARIABLE, &var) && var.value) + PicoIn.sndRate = atoi(var.value); + else + PicoIn.sndRate = INITIAL_SND_RATE; + PicoIn.autoRgnOrder = 0x184; // US, EU, JP vout_width = VOUT_MAX_WIDTH;
View file
libretro-picodrive-0~git20200112.tar.xz/platform/libretro/libretro_core_options.h -> libretro-picodrive-0~git20200716.tar.xz/platform/libretro/libretro_core_options.h
Changed
@@ -200,6 +200,32 @@ }, "60" }, +#if !defined(RENDER_GSKIT_PS2) + { + "picodrive_renderer", + "Renderer", + "Fast renderer can't render any mid-frame image changes so it is useful only for some games.", + { + { "accurate", NULL }, + { "fast", NULL }, + { NULL, NULL }, + }, + "accurate" + }, +#endif + { + "picodrive_sound_rate", + "Sound quality", + "Sound quality (in Hz). A lower value may increase performance.", + { + { "16000", NULL }, + { "22050", NULL }, + { "32000", NULL }, + { "44100", NULL }, + { NULL, NULL }, + }, + "44100" + }, { NULL, NULL, NULL, {{0}}, NULL }, };
View file
libretro-picodrive-0~git20200112.tar.xz/platform/linux/blit.c -> libretro-picodrive-0~git20200716.tar.xz/platform/linux/blit.c
Changed
@@ -61,10 +61,11 @@ for (i = 0; i < 224; i++) { ps += 8; + ps += 32; pd += 32; for (u = 0; u < 256; u++) *pd++ = *ps++; - ps += 64; + ps += 32; pd += 32; } } else {
View file
libretro-picodrive-0~git20200112.tar.xz/platform/linux/emu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/linux/emu.c
Changed
@@ -29,7 +29,7 @@ void pemu_validate_config(void) { -#if !defined(__arm__) && !defined(__i386__) && !defined(__x86_64__) +#if !defined(__arm__) && !defined(__aarch64__) && !defined(__mips__) && !defined(__riscv__) && !defined(__riscv) && !defined(__powerpc__) && !defined(__i386__) && !defined(__x86_64__) PicoIn.opt &= ~POPT_EN_DRC; #endif } @@ -39,10 +39,11 @@ int led_reg, pitch, scr_offs, led_offs; led_reg = Pico_mcd->s68k_regs[0]; - pitch = 320; + pitch = g_screen_ppitch; led_offs = 4; scr_offs = pitch * 2 + 4; +#if 0 if (currentConfig.renderer != RT_16BIT) { #define p(x) px[(x) >> 2] // 8-bit modes @@ -52,7 +53,9 @@ p(pitch*0) = p(pitch*1) = p(pitch*2) = col_g; p(pitch*0 + led_offs) = p(pitch*1 + led_offs) = p(pitch*2 + led_offs) = col_r; #undef p - } else { + } else +#endif + { #define p(x) px[(x)*2 >> 2] = px[((x)*2 >> 2) + 1] // 16-bit modes unsigned int *px = (unsigned int *)((short *)g_screen_ptr + scr_offs); @@ -71,8 +74,8 @@ unsigned char *ps = Pico.est.Draw2FB + 328*8 + 8; unsigned short *pal = Pico.est.HighPal; int i, x; - if (Pico.m.dirtyPal) - PicoDrawUpdateHighPal(); + + PicoDrawUpdateHighPal(); for (i = 0; i < 224; i++, ps += 8) for (x = 0; x < 320; x++) *pd++ = pal[*ps++]; @@ -109,6 +112,8 @@ if (PicoIn.AHW & PAHW_32X) PicoDrawSetOutBuf(g_screen_ptr, g_screen_ppitch * 2); + + Pico.m.dirtyPal = 1; } void plat_video_toggle_renderer(int change, int is_menu) @@ -174,7 +179,10 @@ void emu_video_mode_change(int start_line, int line_count, int is_32cols) { // clear whole screen in all buffers - memset32(g_screen_ptr, 0, g_screen_ppitch * g_screen_height * 2 / 4); + if (currentConfig.renderer != RT_16BIT && !(PicoIn.AHW & PAHW_32X)) + memset32(Pico.est.Draw2FB, 0, (320+8) * (8+240+8) / 4); + else + memset32(g_screen_ptr, 0, g_screen_ppitch * g_screen_height * 2 / 4); } void pemu_loop_prep(void)
View file
libretro-picodrive-0~git20200112.tar.xz/platform/linux/pprof.c -> libretro-picodrive-0~git20200716.tar.xz/platform/linux/pprof.c
Changed
@@ -1,21 +1,46 @@ #include <stdio.h> #include <stdlib.h> #include <unistd.h> +#include <fcntl.h> #include <sys/types.h> #include <sys/ipc.h> #include <sys/shm.h> +#include <sys/mman.h> #include <pico/pico_int.h> +int rc_mem[pp_total_points]; + struct pp_counters *pp_counters; +int *refcounts = rc_mem; static int shmemid; +static unsigned long devMem; +volatile unsigned long *gp2x_memregl; +volatile unsigned short *gp2x_memregs; + void pprof_init(void) { int this_is_new_shmem = 1; key_t shmemkey; void *shmem; +#if 0 + devMem = open("/dev/mem", O_RDWR); + if (devMem == -1) + { + perror("pprof: open failed"); + return; + } + gp2x_memregl = (unsigned long *)mmap(0, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, devMem, 0xc0000000); + if (gp2x_memregl == (unsigned long *)-1) + { + perror("pprof: mmap failed"); + return; + } + gp2x_memregs = (unsigned short *)gp2x_memregl; +#endif + #ifndef PPROF_TOOL unsigned int tmp = pprof_get_one(); printf("pprof: measured diff is %u\n", pprof_get_one() - tmp); @@ -28,11 +53,11 @@ return; } -#ifndef PPROF_TOOL +//#ifndef PPROF_TOOL shmemid = shmget(shmemkey, sizeof(*pp_counters), IPC_CREAT | IPC_EXCL | 0644); if (shmemid == -1) -#endif +//#endif { shmemid = shmget(shmemkey, sizeof(*pp_counters), 0644); @@ -76,15 +101,18 @@ IT(draw), IT(sound), IT(m68k), + IT(s68k), + IT(mem68), IT(z80), IT(msh2), IT(ssh2), + IT(memsh), IT(dummy), }; int main(int argc, char *argv[]) { - unsigned long long old[pp_total_points], new[pp_total_points]; + pp_type old[pp_total_points], new[pp_total_points]; int base = 0; int l, i; @@ -107,11 +135,12 @@ memcpy(new, pp_counters->counter, sizeof(new)); for (i = 0; i < ARRAY_SIZE(pp_tab); i++) { - unsigned long long idiff = new[i] - old[i]; - unsigned long long bdiff = (new[base] - old[base]) | 1; + pp_type idiff = new[i] - old[i]; + pp_type bdiff = (new[base] - old[base]) | 1; printf("%6.2f ", (double)idiff * 100.0 / bdiff); } printf("\n"); + fflush(stdout); memcpy(old, new, sizeof(old)); if (argc < 3)
View file
libretro-picodrive-0~git20200112.tar.xz/platform/linux/pprof.h -> libretro-picodrive-0~git20200716.tar.xz/platform/linux/pprof.h
Changed
@@ -7,21 +7,22 @@ pp_draw, pp_sound, pp_m68k, + pp_s68k, + pp_mem68, pp_z80, pp_msh2, pp_ssh2, + pp_memsh, pp_dummy, pp_total_points }; -struct pp_counters -{ - unsigned long long counter[pp_total_points]; -}; - extern struct pp_counters *pp_counters; +extern int *refcounts; #ifdef __i386__ +typedef unsigned long long pp_type; + static __attribute__((always_inline)) inline unsigned int pprof_get_one(void) { unsigned long long ret; @@ -31,24 +32,38 @@ #define unglitch_timer(x) #elif defined(__GP2X__) +typedef unsigned long pp_type; + +#if 0 // XXX: MMSP2 only, timer sometimes seems to return lower vals? extern volatile unsigned long *gp2x_memregl; #define pprof_get_one() (unsigned int)gp2x_memregl[0x0a00 >> 2] #define unglitch_timer(di) \ if ((signed int)(di) < 0) di = 0 +#else +extern unsigned int (*gp2x_get_ticks_us)(void); +#define pprof_get_one() gp2x_get_ticks_us() +#define unglitch_timer(di) \ + if ((signed int)(di) < 0) di = 0 +#endif #else #error no timer #endif +struct pp_counters +{ + pp_type counter[pp_total_points]; +}; + #define pprof_start(point) { \ - unsigned int pp_start_##point = pprof_get_one() + unsigned int pp_start_##point = pprof_get_one(); refcounts[pp_##point]++ #define pprof_end(point) \ { \ unsigned int di = pprof_get_one() - pp_start_##point; \ unglitch_timer(di); \ - pp_counters->counter[pp_##point] += di; \ + if (!--refcounts[pp_##point]) pp_counters->counter[pp_##point] += di; \ } \ } @@ -57,7 +72,7 @@ { \ unsigned int di = pprof_get_one() - pp_start_##point; \ unglitch_timer(di); \ - pp_counters->counter[pp_##point] -= di; \ + if (--refcounts[pp_##point]) pp_counters->counter[pp_##point] -= di; \ } \ }
View file
libretro-picodrive-0~git20200112.tar.xz/platform/pandora/Makefile -> libretro-picodrive-0~git20200716.tar.xz/platform/pandora/Makefile
Changed
@@ -13,7 +13,7 @@ all: rel ../../tools/textfilter: ../../tools/textfilter.c - make -C ../../tools/ + make -C ../../tools/ textfilter #readme.txt: ../../tools/textfilter ../base_readme.txt ../../ChangeLog # ../../tools/textfilter ../base_readme.txt $@ PANDORA
View file
libretro-picodrive-0~git20200112.tar.xz/platform/psp/emu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/psp/emu.c
Changed
@@ -201,13 +201,22 @@ //for (i = 0x3f/2; i >= 0; i--) // dpal[i] = ((spal[i]&0x000f000f)<< 1)|((spal[i]&0x00f000f0)<<3)|((spal[i]&0x0f000f00)<<4); - do_pal_convert(localPal, Pico.cram, currentConfig.gamma, currentConfig.gamma2); - - Pico.m.dirtyPal = 0; - need_pal_upload = 1; - - if (allow_sh && (Pico.video.reg[0xC]&8)) // shadow/hilight? + if ((currentConfig.EmuOpt&0x80) || (PicoOpt&0x10)) { + do_pal_convert(localPal, Pico.cram, currentConfig.gamma, currentConfig.gamma2); + Pico.m.dirtyPal = 0; + } + else if (Pico.est.rendstatus&0x20) + { + switch (Pico.est.SonicPalCount) { + case 3: do_pal_convert(localPal+0xc0, Pico.est.SonicPal+0xc0, currentConfig.gamma, currentConfig.gamma2); + case 2: do_pal_convert(localPal+0x80, Pico.est.SonicPal+0x80, currentConfig.gamma, currentConfig.gamma2); + case 1: do_pal_convert(localPal+0x40, Pico.est.SonicPal+0x40, currentConfig.gamma, currentConfig.gamma2); + default:do_pal_convert(localPal, Pico.est.SonicPal, currentConfig.gamma, currentConfig.gamma2); + } + } + else if (allow_sh && (Pico.video.reg[0xC]&8)) // shadow/hilight? { + do_pal_convert(localPal, Pico.est.SonicPal, currentConfig.gamma, currentConfig.gamma2); // shadowed pixels for (i = 0x3f/2; i >= 0; i--) dpal[0x20|i] = dpal[0x60|i] = (dpal[i]>>1)&0x7bcf7bcf; @@ -223,6 +232,16 @@ localPal[0xe0] = 0; localPal[0xf0] = 0x001f; } + else if (allow_as) + { + do_pal_convert(localPal, Pico.est.SonicPal, currentConfig.gamma, currentConfig.gamma2); + memcpy((int *)dpal+0x40/2, (void *)localPal, 0x40*2); + memcpy((int *)dpal+0x80/2, (void *)localPal, 0x80*2); + } + + if (Pico.m.dirtyPal == 2) + Pico.m.dirtyPal = 0; + need_pal_upload = 1; } static void do_slowmode_lines(int line_to) @@ -639,7 +658,7 @@ PicoIn.sndOut += len / 2; /*if (PicoIn.sndOut > sndBuffer_endptr) { - memcpy32((int *)(void *)sndBuffer, (int *)endptr, (PicoIn.sndOut - endptr + 1) / 2); + memcpy((int *)(void *)sndBuffer, (int *)endptr, (PicoIn.sndOut - endptr + 1) * 2); PicoIn.sndOut = &sndBuffer[PicoIn.sndOut - endptr]; lprintf("mov\n"); }
View file
libretro-picodrive-0~git20200112.tar.xz/platform/psp/menu.c -> libretro-picodrive-0~git20200716.tar.xz/platform/psp/menu.c
Changed
@@ -59,7 +59,7 @@ // int i; // for (i = 272; i >= 0; i--, dst += 512, src += 480) - // memcpy32((int *)dst, (int *)src, 480*2/4); + // memcpy((int *)dst, (int *)src, 480*2); sceGuSync(0,0); // sync with prev sceGuStart(GU_DIRECT, guCmdList);
View file
libretro-picodrive-0~git20200112.tar.xz/tools/Makefile -> libretro-picodrive-0~git20200716.tar.xz/tools/Makefile
Changed
@@ -1,13 +1,17 @@ -CFLAGS = -Wall -ggdb +TARGETS = amalgamate textfilter +HOSTCC ?= cc -TARGETS = amalgamate textfilter mkoffsets -OBJS = $(addsuffix .o,$(TARGETS)) +all: + if [ -f "offsets/$(XPLATFORM)-offsets.h" ]; then \ + ln -sf "../tools/offsets/$(XPLATFORM)-offsets.h" ../pico/pico_int_offs.h; \ + else \ + CC="$(XCC)" CFLAGS="$(XCFLAGS)" sh ./mkoffsets.sh ../pico; \ + fi -all: $(TARGETS) +$(TARGETS): $(addsuffix .c,$(TARGETS)) + $(HOSTCC) -o $@ -O $@.c clean: $(RM) $(TARGETS) $(OBJS) -mkoffsets: CFLAGS += -m32 -I.. - .PHONY: clean all
View file
libretro-picodrive-0~git20200716.tar.xz/tools/mkoffsets.sh
Added
@@ -0,0 +1,160 @@ +# automatically compute structure offsets for gcc targets in ELF format +# (C) 2018 Kai-Uwe Bloem. This work is placed in the public domain. +# +# usage: mkoffsets <output dir> + +CC=${CC:-gcc} + +# endianess of target (automagically determined below) +ENDIAN= + +# check which object format to dissect +READELF= +OBJDUMP= +check_obj () +{ + # prepare an object file; as side effect dtermine the endianess + CROSS=$(echo $CC | sed 's/gcc.*//') + echo '#include <stdint.h>' >/tmp/getoffs.c + echo "const int32_t val = 1;" >>/tmp/getoffs.c + $CC $CFLAGS -I .. -c /tmp/getoffs.c -o /tmp/getoffs.o || exit 1 + + # check for readelf; readelf is the only toolchain tool not using bfd, + # hence it works with ELF files for every target + if file /tmp/getoffs.o | grep -q ELF; then + if command -v readelf >/dev/null; then + READELF=readelf + elif command -v ${CROSS}readelf >/dev/null; then + READELF=${CROSS}readelf + fi + fi + if [ -n "$READELF" ]; then + # find the the .rodata section (in case -fdata-sections is used) + rosect=$($READELF -S /tmp/getoffs.o | grep '\.rodata\|\.sdata' | + sed 's/^[^.]*././;s/ .*//') + # read .rodata section as hex string (should be only 4 bytes) + ro=$($READELF -x $rosect /tmp/getoffs.o | grep '0x' | cut -c14-48 | + tr -d ' \n' | cut -c1-8) + # if no output could be read readelf isn't working + if [ -z "$ro" ]; then + READELF= + fi + fi + # if there is no working readelf try using objdump + if [ -z "$READELF" ]; then + # objdump is using bfd; try using the toolchain objdump first + # since this is likely working with the toolchain objects + if command -v ${CROSS}objdump >/dev/null; then + OBJDUMP=${CROSS}objdump + elif command -v objdump >/dev/null; then + OBJDUMP=objdump + fi + # find the start line of the .rodata section; read the next line + ro=$($OBJDUMP -s /tmp/getoffs.o | awk '\ + /Contents of section.*(__const|.r[o]?data|.sdata)/ {o=1; next} \ + {if(o) { gsub(/ .*/,""); $1=""; gsub(/ /,""); print; exit}}') + # no working tool for extracting the ro data; stop here + if [ -z "$ro" ]; then + echo "/* mkoffset.sh: no readelf or not ELF, offset table not created */" >$fn + echo "WARNING: no readelf or not ELF, offset table not created" + exit + fi + fi + # extract decimal value from ro + rodata=$(printf "%d" 0x$ro) + ENDIAN=$(if [ "$rodata" -eq 1 ]; then echo be; else echo le; fi) +} + +# compile with target C compiler and extract value from .rodata section +compile_rodata () +{ + $CC $CFLAGS -I .. -c /tmp/getoffs.c -o /tmp/getoffs.o || exit 1 + if [ -n "$READELF" ]; then + # find the .rodata section (in case -fdata-sections is used) + rosect=$(readelf -S /tmp/getoffs.o | grep '\.rodata\|\.sdata' | + sed 's/^[^.]*././;s/ .*//') + # read .rodata section as hex string (should be only 4 bytes) + ro=$(readelf -x $rosect /tmp/getoffs.o | grep '0x' | cut -c14-48 | + tr -d ' \n' | cut -c1-8) + elif [ -n "$OBJDUMP" ]; then + # find the start line of the .rodata section; read the next line + ro=$($OBJDUMP -s /tmp/getoffs.o | awk '\ + /Contents of section.*(__const|.r[o]?data|.sdata)/ {o=1; next} \ + {if(o) { gsub(/ .*/,""); $1=""; gsub(/ /,""); print; exit}}') + fi + if [ "$ENDIAN" = "le" ]; then + # swap needed for le target + hex="" + for b in $(echo $ro | sed 's/\([0-9a-f]\{2\}\)/\1 /g'); do + hex=$b$hex; + done + else + hex=$ro + fi + # extract decimal value from hex string + rodata=$(printf "%d" 0x$hex) +} + +# determine member offset and create #define +get_define () # prefix struct member member... +{ + prefix=$1; shift + struct=$1; shift + field=$(echo $* | sed 's/ /./g') + name=$(echo $* | sed 's/ /_/g') + echo '#include <stdint.h>' > /tmp/getoffs.c + echo '#include "pico/pico_int.h"' >> /tmp/getoffs.c + echo "static struct $struct p;" >> /tmp/getoffs.c + echo "const int32_t val = (char *)&p.$field - (char*)&p;" >>/tmp/getoffs.c + compile_rodata + line=$(printf "#define %-20s 0x%04x" $prefix$name $rodata) +} + +fn="${1:-.}/pico_int_offs.h" +if echo $CFLAGS | grep -qe -flto; then CFLAGS="$CFLAGS -fno-lto"; fi + +check_obj +# output header +echo "/* autogenerated by mkoffset.sh, do not edit */" >$fn +echo "/* target endianess: $ENDIAN, compiled with: $CC $CFLAGS */" >>$fn +# output offsets +get_define OFS_Pico_ Pico video reg ; echo "$line" >>$fn +get_define OFS_Pico_ Pico m rotate ; echo "$line" >>$fn +get_define OFS_Pico_ Pico m z80Run ; echo "$line" >>$fn +get_define OFS_Pico_ Pico m dirtyPal ; echo "$line" >>$fn +get_define OFS_Pico_ Pico m hardware ; echo "$line" >>$fn +get_define OFS_Pico_ Pico m z80_reset ; echo "$line" >>$fn +get_define OFS_Pico_ Pico m sram_reg ; echo "$line" >>$fn +get_define OFS_Pico_ Pico sv ; echo "$line" >>$fn +get_define OFS_Pico_ Pico sv data ; echo "$line" >>$fn +get_define OFS_Pico_ Pico sv start ; echo "$line" >>$fn +get_define OFS_Pico_ Pico sv end ; echo "$line" >>$fn +get_define OFS_Pico_ Pico sv flags ; echo "$line" >>$fn +get_define OFS_Pico_ Pico rom ; echo "$line" >>$fn +get_define OFS_Pico_ Pico romsize ; echo "$line" >>$fn +get_define OFS_Pico_ Pico est ; echo "$line" >>$fn + +get_define OFS_EST_ PicoEState DrawScanline ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState rendstatus ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState DrawLineDest ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState HighCol ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState HighPreSpr ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState Pico ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState PicoMem_vram ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState PicoMem_cram ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState PicoOpt ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState Draw2FB ; echo "$line" >>$fn +get_define OFS_EST_ PicoEState HighPal ; echo "$line" >>$fn + +get_define OFS_PMEM_ PicoMem vram ; echo "$line" >>$fn +get_define OFS_PMEM_ PicoMem vsram ; echo "$line" >>$fn +get_define OFS_PMEM32x_ Pico32xMem pal_native ; echo "$line" >>$fn + +get_define OFS_SH2_ SH2_ is_slave ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_bios ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_da ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_sdram ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_rom ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_dram ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_drcblk_da ; echo "$line" >>$fn +get_define OFS_SH2_ SH2_ p_drcblk_ram ; echo "$line" >>$fn
View file
libretro-picodrive-0~git20200716.tar.xz/tools/offsets
Added
+(directory)
View file
libretro-picodrive-0~git20200716.tar.xz/tools/offsets/generic-ilp32-offsets.h
Added
@@ -0,0 +1,39 @@ +/* autogenerated by mkoffset.sh, do not edit */ +/* target endianess: le, compiled with: mipsel-linux-gnu-gcc -mabi=32 */ +#define OFS_Pico_video_reg 0x0000 +#define OFS_Pico_m_rotate 0x0040 +#define OFS_Pico_m_z80Run 0x0041 +#define OFS_Pico_m_dirtyPal 0x0046 +#define OFS_Pico_m_hardware 0x0047 +#define OFS_Pico_m_z80_reset 0x004f +#define OFS_Pico_m_sram_reg 0x0049 +#define OFS_Pico_sv 0x008c +#define OFS_Pico_sv_data 0x008c +#define OFS_Pico_sv_start 0x0090 +#define OFS_Pico_sv_end 0x0094 +#define OFS_Pico_sv_flags 0x0098 +#define OFS_Pico_rom 0x0554 +#define OFS_Pico_romsize 0x0558 +#define OFS_Pico_est 0x00c8 +#define OFS_EST_DrawScanline 0x0000 +#define OFS_EST_rendstatus 0x0004 +#define OFS_EST_DrawLineDest 0x0008 +#define OFS_EST_HighCol 0x000c +#define OFS_EST_HighPreSpr 0x0010 +#define OFS_EST_Pico 0x0014 +#define OFS_EST_PicoMem_vram 0x0018 +#define OFS_EST_PicoMem_cram 0x001c +#define OFS_EST_PicoOpt 0x0020 +#define OFS_EST_Draw2FB 0x0024 +#define OFS_EST_HighPal 0x0028 +#define OFS_PMEM_vram 0x10000 +#define OFS_PMEM_vsram 0x22100 +#define OFS_PMEM32x_pal_native 0x90e00 +#define OFS_SH2_is_slave 0x055c +#define OFS_SH2_p_bios 0x0080 +#define OFS_SH2_p_da 0x0084 +#define OFS_SH2_p_sdram 0x0088 +#define OFS_SH2_p_rom 0x008c +#define OFS_SH2_p_dram 0x0090 +#define OFS_SH2_p_drcblk_da 0x0094 +#define OFS_SH2_p_drcblk_ram 0x0098
View file
libretro-picodrive-0~git20200716.tar.xz/tools/offsets/generic-llp64-offsets.h
Added
@@ -0,0 +1,39 @@ +/* autogenerated by mkoffset.sh, do not edit */ +/* target endianess: le, compiled with: x86_64-w64-mingw32-gcc */ +#define OFS_Pico_video_reg 0x0000 +#define OFS_Pico_m_rotate 0x0040 +#define OFS_Pico_m_z80Run 0x0041 +#define OFS_Pico_m_dirtyPal 0x0046 +#define OFS_Pico_m_hardware 0x0047 +#define OFS_Pico_m_z80_reset 0x004f +#define OFS_Pico_m_sram_reg 0x0049 +#define OFS_Pico_sv 0x0090 +#define OFS_Pico_sv_data 0x0090 +#define OFS_Pico_sv_start 0x0098 +#define OFS_Pico_sv_end 0x009c +#define OFS_Pico_sv_flags 0x00a0 +#define OFS_Pico_rom 0x0588 +#define OFS_Pico_romsize 0x0590 +#define OFS_Pico_est 0x00d8 +#define OFS_EST_DrawScanline 0x0000 +#define OFS_EST_rendstatus 0x0004 +#define OFS_EST_DrawLineDest 0x0008 +#define OFS_EST_HighCol 0x0010 +#define OFS_EST_HighPreSpr 0x0018 +#define OFS_EST_Pico 0x0020 +#define OFS_EST_PicoMem_vram 0x0028 +#define OFS_EST_PicoMem_cram 0x0030 +#define OFS_EST_PicoOpt 0x0038 +#define OFS_EST_Draw2FB 0x0040 +#define OFS_EST_HighPal 0x0048 +#define OFS_PMEM_vram 0x10000 +#define OFS_PMEM_vsram 0x22100 +#define OFS_PMEM32x_pal_native 0x90e00 +#define OFS_SH2_is_slave 0x0a18 +#define OFS_SH2_p_bios 0x0098 +#define OFS_SH2_p_da 0x00a0 +#define OFS_SH2_p_sdram 0x00a8 +#define OFS_SH2_p_rom 0x00b0 +#define OFS_SH2_p_dram 0x00b8 +#define OFS_SH2_p_drcblk_da 0x00c0 +#define OFS_SH2_p_drcblk_ram 0x00c8
View file
libretro-picodrive-0~git20200716.tar.xz/tools/offsets/generic-lp64-offsets.h
Added
@@ -0,0 +1,39 @@ +/* autogenerated by mkoffset.sh, do not edit */ +/* target endianess: le, compiled with: mipsel-linux-gnu-gcc -mabi=64 */ +#define OFS_Pico_video_reg 0x0000 +#define OFS_Pico_m_rotate 0x0040 +#define OFS_Pico_m_z80Run 0x0041 +#define OFS_Pico_m_dirtyPal 0x0046 +#define OFS_Pico_m_hardware 0x0047 +#define OFS_Pico_m_z80_reset 0x004f +#define OFS_Pico_m_sram_reg 0x0049 +#define OFS_Pico_sv 0x0090 +#define OFS_Pico_sv_data 0x0090 +#define OFS_Pico_sv_start 0x0098 +#define OFS_Pico_sv_end 0x009c +#define OFS_Pico_sv_flags 0x00a0 +#define OFS_Pico_rom 0x0588 +#define OFS_Pico_romsize 0x0590 +#define OFS_Pico_est 0x00d8 +#define OFS_EST_DrawScanline 0x0000 +#define OFS_EST_rendstatus 0x0004 +#define OFS_EST_DrawLineDest 0x0008 +#define OFS_EST_HighCol 0x0010 +#define OFS_EST_HighPreSpr 0x0018 +#define OFS_EST_Pico 0x0020 +#define OFS_EST_PicoMem_vram 0x0028 +#define OFS_EST_PicoMem_cram 0x0030 +#define OFS_EST_PicoOpt 0x0038 +#define OFS_EST_Draw2FB 0x0040 +#define OFS_EST_HighPal 0x0048 +#define OFS_PMEM_vram 0x10000 +#define OFS_PMEM_vsram 0x22100 +#define OFS_PMEM32x_pal_native 0x90e00 +#define OFS_SH2_is_slave 0x0a18 +#define OFS_SH2_p_bios 0x0098 +#define OFS_SH2_p_da 0x00a0 +#define OFS_SH2_p_sdram 0x00a8 +#define OFS_SH2_p_rom 0x00b0 +#define OFS_SH2_p_dram 0x00b8 +#define OFS_SH2_p_drcblk_da 0x00c0 +#define OFS_SH2_p_drcblk_ram 0x00c8
Locations
Projects
Search
Status Monitor
Help
Open Build Service
OBS Manuals
API Documentation
OBS Portal
Reporting a Bug
Contact
Mailing List
Forums
Chat (IRC)
Twitter
Open Build Service (OBS)
is an
openSUSE project
.